
Workshop: Table Representation Learning

Alon Halevy - "Structured Data Inside and Out"

Alon Halevy


WebTables contain high-quality data that is relevant to many queries on search engines. Since they are embedded inside web pages, understanding the semantics of tables requires analyzing the text surrounding them on the page. This talk will begin by recalling some of the early challenges we faced with the WebTables Project at Google. I will then turn to a different kind of challenge at the intersection of structured and unstructured data, where the structured data is outside and the unstructured data is inside. For example, when modeling a set of events in a person’s life (or the history of an enterprise or a culture), each event is described in text and other media, but the event is also associated with structured data such as time and location. Answering questions over such collections of data requires leveraging the structure in the data appropriately. In the second half of the talk, I will discuss the motivations, challenges, and partial solutions to dealing with structured data that is on the outside.
