How does the Facebook crawler work?
Content is most often shared to Facebook in the form of a web page. The first time someone shares a link, the Facebook crawler scrapes the HTML at that URL to gather, cache, and display information about the content on Facebook, such as a title, description, and thumbnail image.
Sharing a webpage directly on Facebook is not the only thing that can trigger a crawl. For example, having any of Facebook's social plugins on a webpage can cause the Facebook crawler to scrape it.
The Facebook crawler needs to be able to access your content in order to scrape and share it correctly, so your pages should be visible to the crawler. If you require login or otherwise restrict access to your content, you'll need to whitelist the Facebook crawler.
Note that the Facebook crawler only accepts gzip and deflate encodings, so make sure your server responds with one of these encodings.
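As a rough illustration of the encoding requirement, the Python sketch below picks gzip or deflate from the request's Accept-Encoding header and otherwise sends the body uncompressed. The function name encode_body is hypothetical, and the parsing is simplified: it ignores quality values such as "q=0", which a production server should respect.

```python
import gzip
import zlib
from typing import Optional, Tuple


def encode_body(body: bytes, accept_encoding: str) -> Tuple[bytes, Optional[str]]:
    """Return (encoded_body, content_encoding), using only gzip or deflate.

    content_encoding is None when the body is returned uncompressed.
    Simplification: quality values (e.g. "gzip;q=0") are ignored.
    """
    # Split "gzip, deflate;q=0.5" into the bare coding names.
    offered = {
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    }
    if "gzip" in offered:
        return gzip.compress(body), "gzip"
    if "deflate" in offered:
        # HTTP "deflate" is the zlib container format, which zlib.compress emits.
        return zlib.compress(body), "deflate"
    # Nothing the crawler accepts was offered, so send the body as-is.
    return body, None
```

The second element of the returned tuple is the value to send in the Content-Encoding response header, if any.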
Your website should either honor the Range header of the crawler's request, returning a response for the specified bytes that contains all required properties, or ignore the Range header altogether.
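The two options for the Range header can be sketched as follows. This is a minimal sketch, not a complete HTTP range implementation: the function name respond is hypothetical, only a single "bytes=start-end" range is recognized, and anything else falls back to ignoring the header and returning the full page. It does not check that the returned bytes actually contain your Open Graph properties; that remains your responsibility.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a single range such as "bytes=0-1023" or "bytes=500-".
_SINGLE_RANGE = re.compile(r"^bytes=(\d+)-(\d*)$")


def respond(body: bytes, range_header: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request with an optional Range header."""
    full = (200, {"Content-Length": str(len(body))}, body)
    match = _SINGLE_RANGE.match(range_header.strip()) if range_header else None
    if match is None:
        # No Range header, or a form this sketch does not handle: ignore it.
        return full
    start = int(match.group(1))
    last = len(body) - 1
    end = min(int(match.group(2)), last) if match.group(2) else last
    if start > last or end < start:
        # Unusable range: this sketch ignores the header and sends everything.
        return full
    chunk = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(body)}",
        "Content-Length": str(len(chunk)),
    }
    return 206, headers, chunk
```

Ignoring the header entirely (always returning the 200 response) is the simpler of the two options and is enough to satisfy the requirement above.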
Note that the Facebook crawler scrapes your page every 30 days, and it only scrapes the first 1 MB of a page, so any Open Graph properties need to be listed before that cutoff.
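One way to sanity-check the 1 MB cutoff is to scan a page's raw bytes for Open Graph meta tags that end beyond it. The sketch below is a rough check under stated assumptions: it treats "1 MB" as 1,048,576 bytes, uses a simple regular expression instead of a real HTML parser, and the name og_tags_past_cutoff is made up for this example.

```python
import re
from typing import List

# Assumption: "1 MB" is taken to mean 1024 * 1024 bytes.
SCRAPE_LIMIT_BYTES = 1024 * 1024

# Rough pattern for <meta ... property="og:..." ...>; not a full HTML parser.
OG_TAG = re.compile(
    rb"""<meta\b[^>]*\bproperty=["']og:[^"']*["'][^>]*>""",
    re.IGNORECASE,
)


def og_tags_past_cutoff(html: bytes, limit: int = SCRAPE_LIMIT_BYTES) -> List[int]:
    """Return the byte offsets of og: meta tags that end after the limit."""
    return [m.start() for m in OG_TAG.finditer(html) if m.end() > limit]
```

An empty list means every Open Graph tag the pattern found sits entirely within the first 1 MB of the page.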
The Facebook crawler can be identified by any of these user agent strings:
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
//or//
facebookexternalhit/1.1
//or//
Facebot
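If you restrict access to your content, a check like the following could be used to recognize the crawler from the User-Agent request header before applying your whitelist rule. It is a minimal sketch: the helper name is_facebook_crawler is hypothetical, and the match is a case-insensitive substring test on the strings listed above. Keep in mind that a User-Agent header can be set by any client, so this identifies requests that claim to be the crawler rather than proving it.

```python
from typing import Optional

# Substrings taken from the user agent strings listed above.
FACEBOOK_CRAWLER_TOKENS = ("facebookexternalhit", "facebot")


def is_facebook_crawler(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header looks like the Facebook crawler."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token in ua for token in FACEBOOK_CRAWLER_TOKENS)
```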