Your machine is probably already comped.
It's easy in PHP, and probably also easy in Perl.
First, pull the content.
You could pick out just the data you need and reassemble it, or simply dump the whole content into a table as is.
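As a rough sketch of the "pull it and dump it into a table" step, here it is in Python with SQLite rather than PHP, only because it is short to show. The `pages` table, its columns, and the function names `fetch` and `store_page` are made up for illustration, and it assumes the page is UTF-8.

```python
import sqlite3
import urllib.request


def fetch(url: str) -> str:
    """Download a page and return its body as text (assumes UTF-8)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def store_page(conn: sqlite3.Connection, url: str, html: str) -> int:
    """Dump the raw HTML into a hypothetical `pages` table, return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL, html TEXT NOT NULL)"
    )
    cur = conn.execute(
        "INSERT INTO pages (url, html) VALUES (?, ?)", (url, html)
    )
    conn.commit()
    return cur.lastrowid
```

You would call it as `store_page(conn, url, fetch(url))`, keeping the fetch separate so the storing part can be tried without a network connection.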
Then extract DOM elements by class name:
https://stackoverflow.com/questions/6366351/getting-dom-elements-by-classname#6366390
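The linked answer covers the PHP side. For comparison, a minimal Python sketch of the same idea using only the standard library's `html.parser` might look like the following. The names `ClassTextExtractor` and `texts_by_class` are invented here, and it assumes reasonably well-formed HTML (unclosed tags such as a bare `<li>` will throw the nesting count off).

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element whose class list contains class_name."""

    def __init__(self, class_name: str):
        super().__init__(convert_charrefs=True)
        self.class_name = class_name
        self.results = []
        self._depth = 0  # 0 means we are outside any matching element
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            # Already inside a match: nested tags only deepen the nesting.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Matching element closed: store its text with whitespace collapsed.
            self.results.append(" ".join("".join(self._buf).split()))

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def texts_by_class(html: str, class_name: str) -> list:
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results
```

Note that a matching element nested inside another matching element is folded into the outer one's text rather than reported separately.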
Insert the results of this into another table and create a cross reference index between the two tables.
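One way the two tables and the cross reference could be laid out, again sketched in Python with SQLite for illustration. The table names (`pages`, `elements`, `page_elements`), the columns, and the function `save_with_xref` are all assumptions, not anything prescribed above.

```python
import sqlite3

# Hypothetical schema: raw pages, extracted elements, and a link table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    id INTEGER PRIMARY KEY, url TEXT NOT NULL, html TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS elements (
    id INTEGER PRIMARY KEY, class_name TEXT NOT NULL, content TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS page_elements (
    page_id INTEGER NOT NULL REFERENCES pages(id),
    element_id INTEGER NOT NULL REFERENCES elements(id),
    PRIMARY KEY (page_id, element_id));
"""


def save_with_xref(conn, url, html, class_name, texts):
    """Store the raw page, each extracted text, and a row linking the two."""
    conn.executescript(SCHEMA)
    with conn:  # commits on success, rolls back if an insert fails
        page_id = conn.execute(
            "INSERT INTO pages (url, html) VALUES (?, ?)", (url, html)
        ).lastrowid
        for text in texts:
            element_id = conn.execute(
                "INSERT INTO elements (class_name, content) VALUES (?, ?)",
                (class_name, text),
            ).lastrowid
            conn.execute(
                "INSERT INTO page_elements (page_id, element_id) VALUES (?, ?)",
                (page_id, element_id),
            )
    return page_id
```

Getting everything extracted from one page back is then a join from `page_elements` to `elements` filtered on `page_id`.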