糖心TV

Skip to main content Skip to navigation

Artificial Intelligence Events

Monday, December 03, 2012

Sun, Dec 02 Today Tue, Dec 04 Jump to any date

How do I use this calendar?

You can click on an event to display further information about it.

The toolbar above the calendar has buttons to view different events. Use the left and right arrow icons to view events in the past and future. The button inbetween returns you to today's view. The button to the right of this shows a mini-calendar to let you quickly jump to any date.

The dropdown box on the right allows you to see a different view of the calendar, such as an agenda or a termly view.

If this calendar has tags, you can use the labelled checkboxes at the top of the page to select just the tags you wish to view, and then click "Show selected". The calendar will be redisplayed with just the events related to these tags, making it easier to find what you're looking for.

 
-
Export as iCalendar
IAS seminar: George Gkotsis
Self-supervised automated wrapper generation for weblog data extraction
Data extraction from the web is notoriously hard. Of the types of resources available on the web, weblogs are becoming increasingly important due to the massive growth of the blogosphere, but remain poorly explored. Past approaches to data extraction from weblogs have often involved manual intervention and suffer from low scalability. This paper proposes a fully automated information extraction methodology for weblogs which integrates a set of relevant approaches based on the use of web feeds and processing of HTML for the extraction of weblog properties. The approach includes a model for generating a wrapper that exploits web feeds for deriving a set of extraction rules automatically. Instead of performing a pairwise comparison between posts, the model matches the values of the web feeds against their corresponding HTML elements retrieved from multiple weblog posts. It adopts a probabilistic approach for deriving a set of rules and automating the process of wrapper generation. An evaluation of the model is conducted on a collection of weblogs and the results show that the proposed technique enables robust extraction of weblog properties and can be applied across the blogosphere for various applications, such as improved information retrieval and more robust web preservation initiatives.

Placeholder

Let us know you agree to cookies