Using chardet to detect web page encoding
2022-07-18 07:11:00 【Full stack programmer webmaster】
Environment: Win7_x64 + Python 3.4.3
First download and install chardet. Download address: https://pypi.python.org/packages/source/c/chardet/chardet-2.3.0.tar.gz
Install: enter the unzipped directory and, in a command window, run: python setup.py install
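Before writing the script, it can help to confirm that the installation succeeded. The following is a minimal sketch, not part of the original post: the helper name is_installed is hypothetical, and it relies only on the standard importlib.util.find_spec function (available since Python 3.4) and on chardet's __version__ attribute.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate module_name."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if is_installed("chardet"):
    import chardet
    print("chardet version:", chardet.__version__)
else:
    print("chardet is not installed")
```

If the second message appears, rerun the install step from the unzipped directory.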
Then write a test Python script (DetectURLCoding.py):
#coding:utf-8
'''python 3.x'''
import sys
import urllib.request
import chardet

# Write data (bytes) to the file fname
def writeFile(fname, data):
    with open(fname, "wb") as f:
        f.write(data)

def blog_detect(blogurl):
    '''Detect the encoding of the page at blogurl'''
    try:
        fp = urllib.request.urlopen(blogurl)
    except Exception as e:
        print(e)
        print('download exception-[%s]' % blogurl)
        return 0
    blog = fp.read()  # in Python 3.x, read() returns the HTML as bytes
    fp.close()
    #writeFile("t.html", blog)
    # get the name of the detected encoding
    codedetect = chardet.detect(blog)['encoding']
    print('%s <- %s' % (blogurl, codedetect))
    return 1

if __name__ == '__main__':
    if len(sys.argv) == 1:
        print('''usage:
    python DetectURLCoding.py http://xxx.com''')
    else:
        v = blog_detect(sys.argv[1])
        print(v)

Running results:
D:\profile\Desktop>PYTHON de.py http://hovertree.com/
http://hovertree.com/ <- utf-8
1
D:\profile\Desktop>PYTHON de.py http://photo.cankaoxiaoxi.com/roll10/2015/0318/709734.shtml
http://photo.cankaoxiaoxi.com/roll10/2015/0318/709734.shtml <- utf-8
1
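The script above prints only the 'encoding' field. The dictionary returned by chardet.detect() also carries a 'confidence' value, and the usual next step is to decode the page bytes with the detected encoding. Below is a minimal sketch of that step; the helper names, the 0.5 threshold, and the utf-8 fallback are illustrative assumptions, not chardet settings or part of the original script.

```python
def pick_encoding(result, min_confidence=0.5, fallback="utf-8"):
    """Choose an encoding name from a chardet.detect() result dict.

    min_confidence and fallback are illustrative defaults (assumptions).
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    # chardet reports encoding=None when it cannot make a guess.
    if encoding is None or confidence < min_confidence:
        return fallback
    return encoding


def decode_page(data, result):
    """Decode page bytes with the chosen encoding, replacing invalid bytes."""
    return data.decode(pick_encoding(result), errors="replace")


def detect_and_decode(data):
    """Run chardet on the raw bytes, then decode them to text."""
    import chardet  # imported here so the helpers above work without chardet
    return decode_page(data, chardet.detect(data))
```

Inside blog_detect, this could be called as detect_and_decode(blog) once fp.read() has returned. Falling back to utf-8 on a low-confidence guess is only one possible policy; a page that declares its own charset may deserve a different fallback.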
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/120432.html Link to the original text: https://javaforall.cn