PHP RegEx Help...

SPeedY_B

I have a full HTML document stored in a variable named $buffer. Within the document there are multiple instances of the following (the asterisks in the line below stand for random numbers):
Code:
<a href="[b]showMessage.do?mmsId=********&inboxItemId=********[/b]" title="View message">
I need to grab just the part in bold. I gather the easiest way of doing this is with preg_match_all()? I'm useless with the expressions side of things, though, so I could do with some help.
 
/showMessage\.do\?mmsId\=[0-9]*\&inboxItemID\=[0-9]*/

Pop that in and see what you get. I'm not too hot at regex without a syntax guide, but that should get you started.
 
Yep, LordOfLA is sort of on the right track, except you said you wanted to capture the href portion, not just the IDs.

Code:
/href=["'](showMessage\.do\?mmsId=[0-9]+\&inboxItemID\=[0-9]+)["']/
This will catch the URL provided it's in an href attribute and wrapped in either single or double quotes.

PHP:
preg_match_all('/href=["\'](showMessage\.do\?mmsId=[0-9]+\&inboxItemID\=[0-9]+)["\']/', $buffer, $matches);

Then all the matches should be found in the $matches array; just print_r the variable and see what you get.

Alternatively, if you are on PHP 5 you could process it as XML and use an XPath expression to get what you want.
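A minimal sketch of the XPath route, for what it's worth. It assumes the dom extension is available, and it uses DOMDocument::loadHTML() rather than a strict XML parse, since scraped pages are rarely well-formed XML. The sample $buffer and the $links variable are made up for illustration.

```php
<?php
// Hypothetical sample standing in for the real $buffer.
$buffer = '<html><body>'
    . '<a href="showInbox.do?sortBy=subject&sortOrder=asc">Subject</a>'
    . '<a href="showMessage.do?mmsId=11111111&inboxItemId=22222222" title="View message">Har</a>'
    . '</body></html>';

// Collect parser warnings internally instead of printing them;
// the lenient HTML parser complains about things like bare ampersands.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($buffer);
libxml_clear_errors();

// Select the href attribute of every anchor that points at showMessage.do.
$xpath = new DOMXPath($doc);
$links = array();
foreach ($xpath->query('//a[starts-with(@href, "showMessage.do?")]/@href') as $attr) {
    $links[] = $attr->nodeValue;
}

print_r($links);
```

The starts-with() test keeps the sort links (showInbox.do) out of the result without needing a regular expression at all.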
 
Ok, using this:
PHP:
preg_match_all('/href=["\'](showMessage\.do\?mmsId=[0-9]+\&inboxItemID\=[0-9]+)["\']/', $buffer, $matches);
print_r($matches);
Produces this:
Code:
Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)
I print $buffer at the top of the page before calling the preg function so I can check what's there manually, and there are four instances (two unique, each printed twice) of the string I mentioned, so I'm not sure why it's producing what look like empty arrays.
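One likely culprit, going by the anchor tag quoted in the first post: the pattern spells the parameter inboxItemID, while the markup has inboxItemId, and PCRE matching is case-sensitive unless the i modifier is given. A small sketch of the difference follows; the one-line $buffer and the names $asPosted, $fixed and $links are stand-ins, not the real page.

```php
<?php
// Hypothetical stand-in for $buffer, built from the anchor tag in the first post.
$buffer = '<a href="showMessage.do?mmsId=11111111&inboxItemId=22222222" title="View message">';

// Pattern as posted: "inboxItemID" (capital D) never appears in the markup.
$asPosted = '/href=["\'](showMessage\.do\?mmsId=[0-9]+\&inboxItemID\=[0-9]+)["\']/';

// Same pattern with the parameter name spelled as it is in the markup.
$fixed = '/href=["\'](showMessage\.do\?mmsId=[0-9]+&inboxItemId=[0-9]+)["\']/';

preg_match_all($asPosted, $buffer, $before);
preg_match_all($fixed, $buffer, $after);

// Index 1 holds the first capture group, i.e. the URL without the href="..." wrapper.
// Each link appears twice on the real page, so drop duplicates.
$links = array_unique($after[1]);

print_r($before[1]); // no matches with the pattern as posted
print_r($links);
```

Adding the i modifier to the posted pattern (a trailing /i) would be another way to get the same matches, at the cost of being looser about case everywhere in the pattern.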
 
Just tried this...
PHP:
preg_match_all("/<table>[\s\w\/<>=\\\"]*<\/table>/", $buffer, $matches);
...as there's only a single table in the entire page. Yet it doesn't produce the expected result.

When shoving in...
PHP:
$buffer = 'Hello, rah rah. <table>Niff this is <b>a test</b></table> nooch nooch';
I'm returned exactly what I'd expect. Not quite sure what's going on.
 
I've deviated into a separate test file now as previously I was grabbing the required page with cURL, so I've snipped some of the page containing bits of the output I need.
PHP:
<?php
$page = <<<EOF

		Start of document removed.

		<table>
			<tr class="tableheadCheckbox">
				<th> </th>
				<th>Preview</th>




				<th><a href="showInbox.do?sortBy=subject&sortOrder=asc"
				       title="Click here to sort by subject">Subject</a></th>



				<th><a href="showInbox.do?sortBy=from&sortOrder=asc"
				       title="Click here to sort by originator">From</a></th>



				<th><a href="showInbox.do?sortBy=received&sortOrder=asc"
				       title="Click here to sort by receive date"><img src="images/tri1.gif" alt="Down arrow to show ascending sort order" /> Received</a></th>



				<th class="noBorderRight"><a href="showInbox.do?sortBy=expires&sortOrder=desc"
				       title="Click here to sort by expiration date">Expires</a></th>


			</tr>

			<tbody>


				<tr class="tableheadCheckbox">
					<td>
						<div class="moduleFrmBox">
							<input type="checkbox" name="selectedItems" value="22222222" title="Check this message">
						</div>
					</td>
					<td>
	          <a href="showMessage.do?mmsId=11111111&inboxItemId=22222222" title="View message">

	  		<img src="http://139.2.165.14/MacsService/Macs/ContentService/-removed-.jpg" alt="Thumbnail preview of user submitted image" border="0" />

	          </a>
					</td>
					<td>


	          <a href="showMessage.do?mmsId=11111111&inboxItemId=22222222" title="View message">


	            Har


	  				</a>

					</td>
					<td>

					  +4400000001

					</td>
					<td>




					  2008/03/14 19:17

					</td>
					<td class="noBorderRight">

	          			  2008/04/13 20:17

					</td>
				</tr>

				<tr class="tableheadCheckbox">
					<td>
						<div class="moduleFrmBox">
							<input type="checkbox" name="selectedItems" value="33333333" title="Check this message">
						</div>
					</td>
					<td>
	          <a href="showMessage.do?mmsId=52546530&inboxItemId=33333333" title="View message">

	  		<img src="http://139.2.165.14/MacsService/Macs/ContentService/-removed-.jpg" alt="Thumbnail preview of user submitted image" border="0" />

	          </a>
					</td>
					<td>


	          <a href="showMessage.do?mmsId=52546530&inboxItemId=33333333" title="View message">


	            FW:


	  				</a>

					</td>
					<td>

					  +440000000000

					</td>
					<td>




					  2008/03/10 18:56

					</td>
					<td class="noBorderRight">

	          			  2008/04/09 19:56

					</td>
				</tr>


			</tbody>
		</table>
		
		End of document removed.

EOF;
	$lines = explode("\n", $page); 	

	echo '<textarea style="width:110em;height:400px;">'; print_r($lines); echo '</textarea><br />';


	// Search the whole document for message links.
	preg_match_all('/href=["\'](showMessage\.do\?mmsId=[0-9]+\&inboxItemID\=[0-9]+)["\']/', $page, $matches);


	preg_match_all("/<table>[\s\w\/<>=\\\"]*<\/table>/m", $page, $table_matches);

	echo '<strong>ID, etc...</strong><pre>'; print_r($matches); echo '</pre> <strong>Tables...</strong><pre>'; print_r($table_matches); echo '</pre>';
?>
Results in the following
Code:
ID, etc...
Array
(
    [0] => Array
        (
        )

    [1] => Array
        (
        )

)
Tables...
Array
(
    [0] => Array
        (
        )

)
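For the table pattern, a probable explanation is that the character class only allows whitespace, word characters and / < > = ", whereas the sample page also has ., ?, &, :, - and + between the table tags, so the match gives up before it reaches </table>. The m modifier does not help, as it only changes how ^ and $ behave. A rough sketch using a lazy .*? with the s modifier instead; the shortened $buffer and the variable names are hypothetical.

```php
<?php
// Hypothetical short table containing characters outside the posted class (., ?, &).
$buffer = 'before <table><tr><td>'
    . '<a href="showInbox.do?sortBy=from&sortOrder=asc">From</a>'
    . "</td></tr>\n</table> after";

// Pattern as posted, written in single quotes; the character class is the same.
$asPosted = '/<table>[\s\w\/<>="]*<\/table>/';

// Lazy "any character" match; the s modifier lets . match newlines too.
$lazy = '/<table>.*?<\/table>/s';

// preg_match_all() returns the number of full-pattern matches.
$hitsPosted = preg_match_all($asPosted, $buffer, $m1);
$hitsLazy = preg_match_all($lazy, $buffer, $m2);

echo $hitsPosted, ' ', $hitsLazy, "\n"; // prints "0 1"
```

The lazy quantifier stops at the first closing tag, which suits a page with a single table; nested tables would need a proper parser rather than a regex.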
 
I'll take a look at it when I wake up tomorrow morning. Just came back from soldering a whole bunch of LEDs.
 
*prod*

Have you had a chance to look at this yet, X?

Edit: Actually, there's no rush, seems O2 have changed the way in which their site works. Seems to be totally ****ing broke for me at the moment, so this script is a bit useless at present. :(

Edit2: I take it back, sorted it again :p
 