Parse an HTML Table with PHP
I recently needed to parse a table out of an HTML file on a number of different pages. To save myself some time, I wrote this simple script to handle the parsing programmatically.
The script will work with most simple tables where the <th> tag has been used to define headers. It is unlikely to work with nested tables! Essentially, it worked for the purpose it was created for, but your mileage may vary!
function parseTable($html)
{
    // Find the first table; give up early if the page has none
    if (!preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $table_html)) {
        return array();
    }

    // Get the header title for each column
    // (\b stops <th from also matching <thead>; /s lets cells span lines)
    preg_match_all("/<th\b.*?>(.*?)<\/[\s]*th>/s", $table_html[0], $matches);
    $row_headers = $matches[1];

    // Iterate over each row
    preg_match_all("/<tr\b.*?>(.*?)<\/[\s]*tr>/s", $table_html[0], $matches);

    $table = array();

    foreach ($matches[1] as $row_html) {
        preg_match_all("/<td\b.*?>(.*?)<\/[\s]*td>/s", $row_html, $td_matches);
        $row = array();
        for ($i = 0; $i < count($td_matches[1]); $i++) {
            // Decode entities and drop any inline markup inside the cell
            $td = strip_tags(html_entity_decode($td_matches[1][$i]));
            $row[$row_headers[$i]] = $td;
        }
        // Rows without <td> cells (such as the header row) are skipped
        if (count($row) > 0) {
            $table[] = $row;
        }
    }

    return $table;
}
Download parseTable.php
Thank you! I have been all over Google looking for a simple version of this that I can use. This is absolutely perfect! This is the first one I found with good regex.
Thanks,
Derek
it rocks… thanks
Please offer “parseTable.php” as a download in some text format.
thanks
Shouldn’t it be this way?
preg_match("/(.*)/s", $html, $table_html);
preg_match_all('/(.*?)/s', $table_html[1], $matches);
whoa! it crashed my regexp )
cool! you are so wonderful~
it helps me a lot.
thank you very much!
How does this work if you have a table in a table?
I think it doesn’t work…
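To illustrate the nested-table case, here is a small sketch that runs the post's table-finding pattern against some made-up nested markup (the `$html` string and its `id` attributes are invented for this example). Because `.*?` is lazy, the match ends at the first closing table tag, which belongs to the inner table, so the outer table is cut short.

```php
<?php
// Made-up nested markup, for illustration only.
$html = '<table id="outer"><tr><td>'
      . '<table id="inner"><tr><td>x</td></tr></table>'
      . '</td></tr></table>';

// Same pattern the post uses to find the table.
preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $table_html);

// The lazy .*? stops at the first closing tag, which closes the inner table,
// so the tail of the outer table is missing from the match.
echo $table_html[0] . "\n";
```

For markup like this, a real HTML parser is probably a safer choice than regular expressions.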
</table
will return this exact text: "”
fix:
table 1
table 2
/table 2
/table1
will match: table2
fix: preg_match("/(.*)?/s", $html, $table_html);
Note the (.*)? instead of .*? so that the matches are put into $table_html. Thanks.
NM. Now, I see what Marat was saying. The first preg_match needs brackets.
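On the brackets question, it may help to see what preg_match actually stores. Index 0 of the matches array always holds the whole match, and parenthesised groups start at index 1. So a pattern without brackets is enough if you read `$table_html[0]`, as the post does, while reading `$table_html[1]` needs a group. A minimal sketch, with a made-up `$html` string and variable names chosen just for this example:

```php
<?php
// Made-up sample input, for illustration only.
$html = '<p>intro</p><table><tr><td>x</td></tr></table><p>outro</p>';

// No capture group: only index 0 (the whole match) is filled.
preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $without_group);

// With a capture group: index 1 holds just the part inside the parentheses.
preg_match("/<table.*?>(.*?)<\/[\s]*table>/s", $html, $with_group);

echo $without_group[0] . "\n"; // the table including its <table> tags
echo $with_group[1] . "\n";    // only the markup between the tags
```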
How can I use this function? Can you give me an example, please?
Thanks
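For anyone else wondering how to call it, here is a minimal usage sketch. The sample markup and the column names are made up for illustration, and the function is condensed from the post so the snippet runs on its own. In practice `$html` would more likely come from something like file_get_contents().

```php
<?php
// parseTable() condensed from the post, repeated so this snippet is self-contained.
function parseTable($html)
{
    preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $table_html);
    preg_match_all("/<th.*?>(.*?)<\/[\s]*th>/", $table_html[0], $matches);
    $row_headers = $matches[1];
    preg_match_all("/<tr.*?>(.*?)<\/[\s]*tr>/s", $table_html[0], $matches);
    $table = array();
    foreach ($matches[1] as $row_html) {
        preg_match_all("/<td.*?>(.*?)<\/[\s]*td>/", $row_html, $td_matches);
        $row = array();
        for ($i = 0; $i < count($td_matches[1]); $i++) {
            $td = strip_tags(html_entity_decode($td_matches[1][$i]));
            $row[$row_headers[$i]] = $td;
        }
        if (count($row) > 0) {
            $table[] = $row;
        }
    }
    return $table;
}

// Hypothetical sample page, for illustration only.
$html = '<html><body>
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Apple</td><td>1.20</td></tr>
  <tr><td><b>Pear</b></td><td>0.80 &amp; up</td></tr>
</table>
</body></html>';

// Each data row comes back as an array keyed by the <th> text.
$rows = parseTable($html);

foreach ($rows as $row) {
    echo $row['Name'] . ': ' . $row['Price'] . "\n";
}
```

The header row produces no entry of its own because it contains no <td> cells, so `$rows` holds only the two data rows.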
Wow! After 30 minutes of searching I finally found something good.
Hello, where can I download the PHP file? The link here sends me to a blank page.
My PHP is a little rusty and I’m a novice with the preg functions. Does anyone know a way to skip checking for TH tags if they don’t exist and/or just number the columns/indices manually? Thanks
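One possible way to do this is sketched below. It is not the author's code: `parseTableWithFallback` is a hypothetical name for a variant that uses the <th> text as the key when a column has one and falls back to the column number otherwise, so a table with no <th> cells simply comes back numerically indexed. The sample table is made up.

```php
<?php
// Hypothetical variant of parseTable(): falls back to numeric column
// indices when a column has no matching <th> header.
function parseTableWithFallback($html)
{
    if (!preg_match("/<table.*?>.*?<\/[\s]*table>/s", $html, $table_html)) {
        return array(); // no table found
    }

    // Headers may be empty if the table has no <th> cells.
    preg_match_all("/<th\b.*?>(.*?)<\/[\s]*th>/s", $table_html[0], $matches);
    $row_headers = array_map('strip_tags', $matches[1]);

    preg_match_all("/<tr\b.*?>(.*?)<\/[\s]*tr>/s", $table_html[0], $matches);

    $table = array();
    foreach ($matches[1] as $row_html) {
        preg_match_all("/<td\b.*?>(.*?)<\/[\s]*td>/s", $row_html, $td_matches);
        $row = array();
        foreach ($td_matches[1] as $i => $cell) {
            // Use the header text as the key if there is one, else the column number.
            $key = isset($row_headers[$i]) ? $row_headers[$i] : $i;
            $row[$key] = strip_tags(html_entity_decode($cell));
        }
        if (count($row) > 0) {
            $table[] = $row;
        }
    }
    return $table;
}

// Made-up table without any <th> cells.
$html = '<table><tr><td>Apple</td><td>1.20</td></tr>'
      . '<tr><td>Pear</td><td>0.80</td></tr></table>';

print_r(parseTableWithFallback($html));
```

If you always want numbered columns, even when headers exist, you could drop the header lookup and use `$i` as the key directly.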