As an example, this post scrapes the URLs of the funny pictures and GIF animations on hao123.
1. Target URL:
http://www.hao123.com/gaoxiao?pn=1
2. Fetch the HTML data, as follows:
NSString *htmlString = [NSString stringWithContentsOfURL:[NSURL URLWithString:@"http://www.hao123.com/gaoxiao?pn=1"] encoding:NSUTF8StringEncoding error:nil];
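The call above passes error:nil, so a failed request or an encoding mismatch simply yields nil with no explanation. Below is a minimal sketch of the same fetch that keeps the error. The helper name MYLoadHTML is hypothetical and not part of the original post. stringWithContentsOfURL:encoding:error: is synchronous, so it is best called off the main thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): loads a page as UTF-8
// text and logs the failure reason instead of discarding it.
static NSString *MYLoadHTML(NSURL *url)
{
    NSError *error = nil;
    NSString *html = [NSString stringWithContentsOfURL:url
                                              encoding:NSUTF8StringEncoding
                                                 error:&error];
    if (html == nil) {
        // Reached on network failure, a missing resource, or non-UTF-8 data.
        NSLog(@"Failed to load %@: %@", url, error);
    }
    return html;
}
```

Calling MYLoadHTML([NSURL URLWithString:@"http://www.hao123.com/gaoxiao?pn=1"]) would then return either the page text or nil, with the reason logged.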
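3. Inspect the page source and find the marker strings that appear immediately before and after each resource link.
For the target page, the markers before and after each resource are:
Before: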
@"<img selector=\"pic\" img-src=\""
After:
@"\" src="
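To show how the two markers bracket a link, here is a minimal sketch that pulls out the text between the first pair of markers. The helper name MYFirstSubstringBetween is hypothetical, and the sample tag in the usage note is an assumed shape inferred from the markers, not copied from the real page.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the text between
// the first occurrence of `prefix` and the next occurrence of `suffix`,
// or nil if either marker is missing.
static NSString *MYFirstSubstringBetween(NSString *text, NSString *prefix, NSString *suffix)
{
    if (text.length == 0 || prefix.length == 0 || suffix.length == 0) {
        return nil;
    }
    NSRange start = [text rangeOfString:prefix];
    if (start.location == NSNotFound) {
        return nil;
    }
    // Everything after the leading marker.
    NSString *rest = [text substringFromIndex:NSMaxRange(start)];
    NSRange end = [rest rangeOfString:suffix];
    if (end.location == NSNotFound) {
        return nil;
    }
    return [rest substringToIndex:end.location];
}
```

Given an assumed tag such as <img selector="pic" img-src="http://img.example.com/data/1" src="placeholder.gif">, passing the two markers above returns http://img.example.com/data/1.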
4. Extract the needed substrings from htmlString, as follows:
Add a category to NSString:
@interface NSString (MYNSStringExtensionMethods)
// Returns every substring that lies between fromString and the next toString.
- (NSArray *)componentsSeparatedFromString:(NSString *)fromString toString:(NSString *)toString;
@end
@implementation NSString (MYNSStringExtensionMethods)
- (NSArray *)componentsSeparatedFromString:(NSString *)fromString toString:(NSString *)toString
{
    // Both markers must be non-empty.
    if (!fromString || !toString || fromString.length == 0 || toString.length == 0) {
        return nil;
    }
    NSMutableArray *subStringsArray = [[NSMutableArray alloc] init];
    NSString *tempString = self;
    NSRange range = [tempString rangeOfString:fromString];
    while (range.location != NSNotFound) {
        // Drop everything up to and including the leading marker.
        tempString = [tempString substringFromIndex:(range.location + range.length)];
        range = [tempString rangeOfString:toString];
        if (range.location != NSNotFound) {
            // The text before the trailing marker is one result.
            [subStringsArray addObject:[tempString substringToIndex:range.location]];
            // Look for the next leading marker in the remaining text.
            range = [tempString rangeOfString:fromString];
        }
        else
        {
            // A leading marker without a trailing marker: stop.
            break;
        }
    }
    return subStringsArray;
}
@end
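The category above creates a new substring on every pass through the loop. As a variation, the sketch below walks a search range over the original string with rangeOfString:options:range: and copies only the matched pieces. The function name MYSubstringsBetween is hypothetical. It also differs from the category in two ways: it resumes searching after the trailing marker rather than right after the leading one, and it returns an empty array instead of nil for empty arguments.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): collects every substring
// between `from` and the next `to`, scanning `text` in a single pass.
static NSArray<NSString *> *MYSubstringsBetween(NSString *text, NSString *from, NSString *to)
{
    NSMutableArray<NSString *> *results = [NSMutableArray array];
    if (text.length == 0 || from.length == 0 || to.length == 0) {
        return results;
    }
    NSRange search = NSMakeRange(0, text.length);
    while (search.length > 0) {
        // Find the next leading marker inside the remaining range.
        NSRange start = [text rangeOfString:from options:0 range:search];
        if (start.location == NSNotFound) {
            break;
        }
        NSUInteger contentStart = NSMaxRange(start);
        NSRange rest = NSMakeRange(contentStart, text.length - contentStart);
        // Find the trailing marker that closes this match.
        NSRange end = [text rangeOfString:to options:0 range:rest];
        if (end.location == NSNotFound) {
            break; // leading marker without a trailing marker
        }
        NSRange content = NSMakeRange(contentStart, end.location - contentStart);
        [results addObject:[text substringWithRange:content]];
        // Continue after the trailing marker.
        NSUInteger next = NSMaxRange(end);
        search = NSMakeRange(next, text.length - next);
    }
    return results;
}
```

For pages the size of the one used here, the difference is unlikely to matter; the range-based form mainly avoids repeated copying on large inputs.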
5. Extract and print the resource URLs:
NSArray *urls = [htmlString componentsSeparatedFromString:@"<img selector=\"pic\" img-src=\"" toString:@"\" src="];
Print them:
NSLog(@"find urls:%@", urls);
Output:
find urls: (
http://img.hao123.com/data/3_a43d768470ea5785e5bbf3ca2c81e4a7_430,
http://img6.hao123.com/data/3_c87ac28d85b361b5efc9654cdb24c745_430,
http://img0.hao123.com/data/3_759c73a935eb8c3ebae5646eb71b3028_0,
http://img.hao123.com/data/3_415b7834328e6a4fc70f50854828df22_0,
http://img5.hao123.com/data/3_e84664284f1cbf59eb364d147fc1610f_430
)