Extracting URLs, titles, images, and more with regular expressions: an example implemented in .NET, ASP, and JavaScript

(Reposted)
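In scraping, filtering, and similar tasks, regular expressions have a clear advantage.
For example, take the following string:
<li><a href="http://www.abcxyz.com/something/article/143.htm" title="FCKEditor高亮代码插件测试"><span class="article-date">[09/11]</span>FCKEditor高亮代码插件测试</a></li>
The goal is to extract the URL that follows href, the date inside the square brackets, and the link text.
Implementations in C#, ASP, and JavaScript follow.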
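To make the shared pattern easier to follow, here is a minimal sketch in JavaScript that labels each capture with a named group. The group names `url`, `date`, and `text` and the helper `extractLink` are illustrative assumptions, not part of the original code, and the sample markup is reconstructed from the string above. Named groups need an ES2018-capable engine.

```javascript
// Sample markup, reconstructed from the example string in the article.
const html =
  '<li><a href="http://www.abcxyz.com/something/article/143.htm" ' +
  'title="FCKEditor高亮代码插件测试"><span class="article-date">[09/11]</span>' +
  'FCKEditor高亮代码插件测试</a></li>';

// Same structure as the article's pattern; only the group names are added.
//   url  : everything after "http://" up to the closing quote of href
//   date : the text between [ and ]
//   text : the link text between </span> and the next <
const pattern =
  /http:\/\/(?<url>[^\s]+)".+?span.+?\[(?<date>.+?)\].+?>(?<text>.+?)</i;

// Hypothetical helper: returns the three fields, or null when nothing matches.
function extractLink(source) {
  const m = pattern.exec(source);
  if (m === null) return null;
  const { url, date, text } = m.groups;
  return { url, date, text };
}

console.log(extractLink(html));
```

Note that the first group starts after `http://`, so the captured value has no scheme. If the full URL is wanted, one option is to move the opening parenthesis in front of `http`.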
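C# implementation

    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main()
        {
            string strHTML = "<li><a href=\"http://www.abcxyz.com/something/article/143.htm\" title=\"FCKEditor高亮代码插件测试\"><span class=\"article-date\">[09/11]</span>FCKEditor高亮代码插件测试</a></li>";
            // Group 1: URL after "http://"; group 2: date inside []; group 3: link text
            string pattern = "http://([^\\s]+)\".+?span.+?\\[(.+?)\\].+?>(.+?)<";
            Regex reg = new Regex(pattern, RegexOptions.IgnoreCase);
            MatchCollection mc = reg.Matches(strHTML);
            if (mc.Count > 0)
            {
                foreach (Match m in mc)
                {
                    Console.WriteLine(m.Groups[1].Value);
                    Console.WriteLine(m.Groups[2].Value);
                    Console.WriteLine(m.Groups[3].Value);
                }
            }
        }
    }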
ASP implementation

    <%
    Dim str, reg, objMatches
    str = "<li><a href=""http://localhost/Z-Blog18/article/143.htm"" title=""FCKEditor高亮代码插件测试""><span class=""article-date"">[09/11]</span>FCKEditor高亮代码插件测试</a></li>"
    Set reg = New RegExp
    reg.IgnoreCase = True
    reg.Global = True
    ' SubMatches(0): URL after "http://"; SubMatches(1): date; SubMatches(2): link text
    reg.Pattern = "http://([^\s]+)"".+?span.+?\[(.+?)\].+?>(.+?)<"
    Set objMatches = reg.Execute(str)
    If objMatches.Count > 0 Then
        Response.Write("URL: ")
        Response.Write(objMatches(0).SubMatches(0))
        Response.Write("<br>")
        Response.Write("Date: ")
        Response.Write(objMatches(0).SubMatches(1))
        Response.Write("<br>")
        Response.Write("Title: ")
        Response.Write(objMatches(0).SubMatches(2))
    End If
    %>
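JavaScript implementation

    <script type="text/javascript">
    var str = '<li><a href="http://localhost/Z-Blog18/article/143.htm" title="FCKEditor高亮代码插件测试"><span class="article-date">[09/11]</span>FCKEditor高亮代码插件测试</a></li>';
    var pattern = /http:\/\/([^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)</gi;
    var mts = pattern.exec(str);
    if (mts != null) {
        alert(mts[1]); // URL after "http://"
        alert(mts[2]); // date inside []
        alert(mts[3]); // link text
        // The pattern has only three groups, so mts[4] would be undefined.
    }
    </script>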
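The JavaScript version calls `exec` once, so it reports only the first link even though the pattern carries the `g` flag. When a page holds several such list items, one way to collect them all is `String.prototype.matchAll` (ES2020). The following is a sketch under assumptions: the second list item, the variable `page`, and the helper `extractAll` are invented here for illustration and do not come from the original post.

```javascript
// Two list items in the article's markup style; the second is made up.
const page =
  '<li><a href="http://localhost/Z-Blog18/article/143.htm" title="FCKEditor高亮代码插件测试">' +
  '<span class="article-date">[09/11]</span>FCKEditor高亮代码插件测试</a></li>\n' +
  '<li><a href="http://localhost/Z-Blog18/article/144.htm" title="Second post">' +
  '<span class="article-date">[09/12]</span>Second post</a></li>';

// Same pattern as the article; matchAll requires the g flag.
const linkPattern = /http:\/\/([^\s]+)".+?span.+?\[(.+?)\].+?>(.+?)</gi;

// Hypothetical helper: one { url, date, text } object per matched link.
function extractAll(source) {
  return Array.from(source.matchAll(linkPattern), (m) => ({
    url: m[1],  // URL after "http://"
    date: m[2], // text inside the square brackets
    text: m[3], // link text
  }));
}

for (const item of extractAll(page)) {
  console.log(item.url, item.date, item.text);
}
```

Because `.` does not match a newline without the `s` flag, each match stays within one line here. `matchAll` works on a copy of the regex, so repeated calls do not depend on a leftover `lastIndex`.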

文博客-9牛1毛


Saturday, May 2, 2009
