Panda has not gone away: sites are still suffering in Google's rankings after being hit by the Panda algorithm. New sites are hit monthly, while other sites are released from its grip in some fashion.
A WebmasterWorld thread discusses one technique large sites can use to determine which sections of their sites are impacted by Panda.
The concept is to use XML Sitemaps, breaking the sitemap files into a logical structure that mirrors the web site. Then, once Google processes all the files, Google will quickly show you how many pages within each sitemap file were indexed or not indexed.
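A minimal sketch of the segmentation step might look like the following. This is not the webmaster's actual tooling; it assumes, for illustration, that each site section corresponds to the first URL path segment and that each sitemap file holds at most 100 URLs, as described in the thread.

```python
from collections import defaultdict
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemaps(urls, chunk_size=100):
    """Group URLs by their first path segment (assumed to map to a
    site section) and split each group into sitemap files holding
    at most chunk_size URLs each."""
    sections = defaultdict(list)
    for url in urls:
        # e.g. https://example.com/widgets/blue-widget -> section "widgets"
        host_and_path = url.split("//", 1)[-1]
        parts = host_and_path.split("/", 1)
        section = parts[1].split("/", 1)[0] if len(parts) > 1 and parts[1] else "root"
        sections[section].append(url)

    sitemaps = {}  # hypothetical filename -> sitemap XML string
    for section, section_urls in sections.items():
        for i in range(0, len(section_urls), chunk_size):
            urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
            for url in section_urls[i:i + chunk_size]:
                loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
                loc.text = url
            name = f"sitemap-{section}-{i // chunk_size + 1}.xml"
            sitemaps[name] = ET.tostring(urlset, encoding="unicode")
    return sitemaps
```

Each generated file would then be submitted to Google, whose sitemap report shows indexed counts per file, letting you compare indexation rates section by section.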
One webmaster explained how he went about this technique:
The sector I was performing in allowed me to create dozens of sitemaps of 100 pages each. There was no reason why any of the pages should not be indexed. I found some sitemaps with 0 pages listed, then others ranging from 25 up to the full 100. I then discovered trends, i.e. pages with similar title tags and URLs. (The on-page content was considerably different, which is why I did not remove them initially.)
I then ran different experiments with each sitemap group until I saw a recovery, then applied the solutions across the board.
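The triage step described above can be sketched simply: given per-sitemap indexed counts like those Google's sitemap report provides, flag the files whose indexation ratio falls below some cutoff. The function name, the 100-URL denominator, and the 50% threshold are illustrative assumptions, not anything from the thread.

```python
def flag_problem_sitemaps(indexed_counts, submitted=100, threshold=0.5):
    """Return sitemap filenames whose indexed/submitted ratio is below
    threshold; these are the groups worth experimenting on first.

    indexed_counts: dict mapping sitemap filename -> pages indexed
    submitted: URLs submitted per sitemap file (100 in the thread)
    threshold: hypothetical cutoff for 'underperforming'
    """
    return sorted(name for name, indexed in indexed_counts.items()
                  if indexed / submitted < threshold)
```

Sitemaps flagged this way would be the candidates for the per-group experiments the webmaster describes, with winning fixes then rolled out site-wide.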
Here is my question, though I am not certain of the answer: I thought that on sites impacted by Panda, the pages are indexed but simply don't rank as well. Meaning, if a page is not indexed, that is more of a crawl budget and PageRank issue (or a server setup issue) than a Panda issue. With Panda, the content has to be indexed for Google to know not to rank it well. Am I wrong?
Forum discussion at WebmasterWorld.
Image credit to BigStockPhoto for abstract image