Parsing HTML pages with Python's HTMLParser

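Python ships with many simple, practical tools. This post records an example from my own learning: using the HTMLParser module to parse a web page that has been fetched.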
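As a rough illustration of the approach, here is a minimal sketch of an HTMLParser subclass that collects the text of links. It is not the original program: the names `LinkTextParser` and `extract_link_texts` are hypothetical, the choice to look only at `<a>` tags and to strip all whitespace is an assumption based on how the output looks, and the sample HTML is made up. It is written for Python 3, where the module lives at `html.parser`; in Python 2 the module was named `HTMLParser`.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect the text inside <a> elements (hypothetical helper, not the author's code)."""

    def __init__(self):
        super().__init__()
        self.in_link = False   # True while we are between <a> and </a>
        self.buffer = []       # text fragments of the link currently being read
        self.link_texts = []   # finished link texts, in document order

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag; tag names arrive lowercased
        if tag == "a":
            self.in_link = True
            self.buffer = []

    def handle_data(self, data):
        # Called for text between tags; keep it only while inside a link
        if self.in_link:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            # Assumption: drop all whitespace, so "Advanced Search" becomes "AdvancedSearch"
            text = "".join("".join(self.buffer).split())
            if text:
                self.link_texts.append(text)
            self.in_link = False


def extract_link_texts(html):
    """Feed an HTML string to the parser and return the collected link texts."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.link_texts


if __name__ == "__main__":
    # Made-up sample input; a real run would pass in the downloaded page source
    sample = '<p><a href="/about/">About</a> <a href="/psf/">Python Software Foundation</a></p>'
    for text in extract_link_texts(sample):
        print(text)
```

The parser is event driven: `feed()` walks the markup and calls the `handle_*` methods as it meets tags and text, so the subclass only has to remember whether it is currently inside a link.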

Here is the output from running the program:

    AdvancedSearch
    About
    News
    Documentation
    Download
    下载
    Community
    Foundation
    CoreDevelopment
    Help
    PackageIndex
    QuickLinks(2.7.2)
    Documentation
    WindowsInstaller
    SourceDistribution
    QuickLinks(3.2.2)
    Documentation
    WindowsInstaller
    SourceDistribution
    PythonJobs
    PythonMerchandise
    PythonWiki
    PythonInsiderBlog
    Python2or3?
    HelpMaintainWebsite
    HelpFundPython
    Non-EnglishResources
    PythonReleaseScheduleiCalCalendar
    Python3
    PyPIpackagename
    Results
    Rackspace
    IndustrialLightandMagic
    AstraZeneca
    Honeywell
    andmanyothers
    eWeek
    more...
    WebProgramming
    CGI
    Zope
    Django
    TurboGears
    XML
    Databases
    ODBC
    MySQL
    GUIDevelopment
    wxPython
    tkInter
    PyGtk
    PyQt
    ScientificandNumeric
    Physics
    Education
    pyBiblio
    SoftwareCarpentryCourse
    Networking
    Sockets
    Twisted
    SoftwareDevelopment
    Buildbot
    Trac
    Roundup
    IDEs
    GameDevelopment
    PyGame
    PyKyra
    3DRendering
    more...
    opensourcelicense
    Python2orPython3
    PythonSoftwareFoundation
    PyConconference
    Readmore
    downloadPythonnow
    O'ReillyOpenSourceConvention
    CallforProposals
    BestProgrammingLanguage
    PyConinChina
    IronPython2.7.1
    PyArkansas
    PyGotham
    Python3.2.2
    RSS
    WebsitemaintainedbythePythoncommunity
    hostingbyxs4all
    designbyTimParkin
    PythonSoftwareFoundation
    LegalStatements
Xiang Chao 19 February 2012