I attended the Solr SF meetup at Airbnb and listened to a very interesting talk by Mousom.
Airbnb uses Lucene for its search functionality. It relies on Lucene's geo filter to find listings by location, then uses a forward index to filter further on availability, pricing, etc. Search requests are distributed across a cluster of search nodes, and each node holds the entire data set (a few million listings) on a single box. Data is distributed to the search nodes through a pub/sub mechanism rather than conventional replication. All boxes run on Amazon, and each is approximately a C4 large instance.
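To make the two-stage filtering concrete, here is a minimal sketch in plain Python. It is not Airbnb's code and does not use Lucene. The `Listing` fields, the haversine radius check standing in for the geo filter, and the dictionary standing in for the forward index are all my own assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    # Hypothetical fields; the talk did not describe the actual schema.
    listing_id: int
    lat: float
    lon: float
    price: float
    available_dates: frozenset


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def geo_filter(listings, lat, lon, radius_km):
    """Stage 1 (stand-in for the geo filter): ids of listings within the radius."""
    return [l.listing_id for l in listings
            if haversine_km(lat, lon, l.lat, l.lon) <= radius_km]


def build_forward_index(listings):
    """Forward index: listing id -> its stored attributes."""
    return {l.listing_id: l for l in listings}


def search(listings, forward_index, lat, lon, radius_km, max_price, dates):
    """Stage 2: look up each geo candidate in the forward index and
    keep it only if it passes the price and availability checks."""
    results = []
    for listing_id in geo_filter(listings, lat, lon, radius_km):
        doc = forward_index[listing_id]
        if doc.price <= max_price and dates <= doc.available_dates:
            results.append(listing_id)
    return results


# Made-up sample data: three listings in San Francisco, one in Oakland.
BOTH_NIGHTS = frozenset({"night-1", "night-2"})
LISTINGS = [
    Listing(1, 37.7749, -122.4194, 120.0, BOTH_NIGHTS),
    Listing(2, 37.7800, -122.4100, 300.0, BOTH_NIGHTS),             # too expensive
    Listing(3, 37.8044, -122.2712, 90.0, BOTH_NIGHTS),              # outside radius
    Listing(4, 37.7700, -122.4200, 100.0, frozenset({"night-1"})),  # not available
]
FORWARD_INDEX = build_forward_index(LISTINGS)

if __name__ == "__main__":
    print(search(LISTINGS, FORWARD_INDEX, 37.7749, -122.4194,
                 radius_km=5.0, max_price=150.0, dates=BOTH_NIGHTS))
```

The point of the split is that the geo stage narrows a few million listings down to a small candidate set cheaply, so the per-listing lookups for fast-changing fields like price and availability only run on those candidates.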
You can access the slides here.