• 数据密集型应用系统设计2021-9-5

    /

    MapReduce

    1. map (collect)
    2. reduce (fold or inject)

    cypher 查询语言是用来查询图状数据。

    RDF Resource Description Framework

    Sparql 查询三元储存的数据

    Datalog: cascalog 是用于查询 Hadoop大数据集的datalog 实现

    Datalog 类似于三元存储的数据,但是更通用一些,采用 谓语(主体,客体)的表达方式而不是(主体,谓语,客体)。 Example:
    name(namerica, ‘North America’).
    type(namerica, continent).

    datalog 是 prolog 的子集

  • Umich Website Spider

    /

    University of Michigan was planning to migrate its website content from Plone CMS to WordPress. I wrote a web spider to collect all posts from the original website during my intern to help the migration.

    umich web spider