The FrameNet project is building a lexical database of English that is both human- and machine-readable, based on annotating examples of how words are used in actual texts. From the student's point of view, it is a dictionary of more than 10,000 word senses, most of them with annotated examples that show the meaning and usage. For the researcher in Natural Language Processing, the more than 170,000 manually
FrameNet is based on a theory of meaning called Frame Semantics, deriving from the work of Charles J. Fillmore and colleagues (Fillmore 1976, 1977, 1982, 1985; Fillmore and Baker 2001, 2010). The basic idea is straightforward: the meanings of most words can best be understood on the basis of a semantic frame, a description of a type of event, relation, or entity and the participants in it. For example, the concept of cooking typically involves a person doing the cooking (Cook), the food that is to be cooked (Food), something to hold the food while cooking (Container) and a source of heat (Heating_instrument). In the FrameNet project, this is represented as a frame called Apply_heat, and the Cook, Food, Heating_instrument and Container are called frame elements (FEs). Words that evoke this frame, such as fry, bake, boil, and broil, are called lexical units (LUs) of the Apply_heat frame. Other frames are more complex, such as Revenge, which involves more FEs (Offender, Injury, Injured_Party, Avenger, and Punishment), and others are simpler, such as Placing, with only an Agent (or Cause), a thing that is placed (called a Theme) and the location in which it is placed (Goal). The job of FrameNet is to define the frames and to annotate sentences to show how the FEs fit syntactically around the word that evokes the frame, as in the following examples of Apply_heat and Revenge:
- ... [Cook the boys] ... GRILL [Food their catches] [Heating_instrument on an open fire].
- [Avenger I] 'll GET EVEN [Offender with you] [Injury for this]!
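The vocabulary introduced above (a frame, its frame elements, and the lexical units that evoke it) can be pictured as a small data structure. The following Python sketch is only an illustration: the `Frame` class, its field names, and the `frames_evoked_by` helper are hypothetical and do not reflect the format of the FrameNet database. The FE and LU lists contain only the items mentioned in this text, not the full inventory of each frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Illustrative stand-in for a semantic frame (not FrameNet's own schema)."""
    name: str
    frame_elements: tuple  # FE names, e.g. "Cook"
    lexical_units: tuple   # words that evoke the frame, e.g. "fry"


# Only the FEs and LUs named in the text above are listed here.
APPLY_HEAT = Frame(
    name="Apply_heat",
    frame_elements=("Cook", "Food", "Container", "Heating_instrument"),
    lexical_units=("fry", "bake", "boil", "broil", "grill"),
)
REVENGE = Frame(
    name="Revenge",
    frame_elements=("Offender", "Injury", "Injured_Party", "Avenger", "Punishment"),
    lexical_units=("get even",),
)
FRAMES = (APPLY_HEAT, REVENGE)


def frames_evoked_by(word, frames=FRAMES):
    """Return the names of all frames that list `word` as a lexical unit."""
    return [f.name for f in frames if word.lower() in f.lexical_units]


print(frames_evoked_by("bake"))      # ['Apply_heat']
print(frames_evoked_by("get even"))  # ['Revenge']
print(frames_evoked_by("tree"))      # []
```

A word that is not listed as an LU of any frame, such as tree in this toy inventory, simply returns an empty list.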
In the simplest case, the frame-evoking word is a verb and the FEs are its syntactic dependents, as in the Apply_heat example above, where boys is the subject of the verb grill, their catches is the direct object, and on an open fire is a prepositional phrase modifying grill. LUs can also be event nouns such as retaliation, also in the Revenge frame:
- [Punishment This attack was conducted] [Support in] RETALIATION [Injury for the U.S. bombing raid on Tripoli...]
or adjectives such as asleep in the Sleep frame:
- [Sleeper They] [Copula were] ASLEEP [Duration for hours]
The lexical entry for each LU is derived from such annotations, and specifies the ways in which FEs are realized in syntactic structures headed by the word.
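To make the bracket notation of the examples in this section concrete, here is a minimal Python sketch that splits one example line into its target word and its labeled spans, and then keeps only the frame element labels. The regular expression, the function names, and the rule that the target is whatever appears in capitals outside the brackets are assumptions made for this illustration; they are not part of any FrameNet tool or file format.

```python
import re

# Assumed shape of a labeled span in the examples: "[Label some text]".
SPAN = re.compile(r"\[\s*(\w+)\s+([^\]]+)\]")

# Support and Copula mark supporting words rather than frame elements.
NON_FE_LABELS = {"Support", "Copula"}


def parse_annotation(line):
    """Split one bracketed example into its target and its labeled spans."""
    spans = [(label, text.strip()) for label, text in SPAN.findall(line)]
    # Assumption: outside the brackets, only the target is written in capitals.
    outside = SPAN.sub(" ", line)
    target = " ".join(t for t in outside.split() if t.isalpha() and t.isupper())
    return target, spans


def realized_fes(spans):
    """Keep only the labels that name frame elements."""
    return [label for label, _ in spans if label not in NON_FE_LABELS]


target, spans = parse_annotation(
    "[Sleeper They] [Copula were] ASLEEP [Duration for hours]"
)
print(target)               # ASLEEP
print(realized_fes(spans))  # ['Sleeper', 'Duration']
```

Collecting the output of `realized_fes` over many sentences for the same target gives a rough picture of which FEs an LU tends to express, which is the kind of information a lexical entry summarizes.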
Many common nouns, such as tree, hat or tower, usually serve as dependents which head FEs, rather than clearly evoking their own frames, so we have devoted less effort to annotating them, since information about them is available from other lexicons, such as WordNet (Miller et al. 1990). We do, however, recognize that such nouns also have a minimal frame structure of their own, and in fact, the FrameNet database contains slightly more nouns than verbs.
Formally, FrameNet annotations are sets of triples that represent the FE realizations for each annotated sentence, each consisting of a frame element name (for example, Food), a grammatical function (say, Object) and a phrase type (say, noun phrase (NP)). We can think of these three types of annotation on each FE as "layers", but the grammatical function and phrase-type layers are not displayed in the web-based report system, to avoid visual clutter. The downloadable XML version of the data includes these three layers (and several more not discussed here) for all of the annotated sentences, along with complete frame and FE descriptions, frame-frame relations, and lexical entries for each annotated LU. Most of the annotations are of separate sentences annotated for only one LU, but there is also a collection of texts in which all the frame-evoking words have been annotated; the overlapping frames provide a rich representation of much of the meaning of the entire text. The FrameNet team have defined more than 1,000 semantic frames and have linked them together by a system of frame relations, which relate more general frames to more specific ones and provide a basis for reasoning about events and intentional actions.
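The triple structure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the layout of the XML release: the `FERealization` type and the `realizations_by_fe` helper are hypothetical names, and the grammatical function and phrase type labels ("Subject", "Modifier", "PP") are plain-English stand-ins chosen for readability rather than the label set used in the actual data.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class FERealization(NamedTuple):
    fe: str  # frame element name, e.g. "Food"
    gf: str  # grammatical function, e.g. "Object"
    pt: str  # phrase type, e.g. "NP"


# Triples for "... the boys ... GRILL their catches on an open fire."
# GF and PT labels are plain-English stand-ins, not FrameNet's label set.
grill_sentence = frozenset({
    FERealization("Cook", "Subject", "NP"),
    FERealization("Food", "Object", "NP"),
    FERealization("Heating_instrument", "Modifier", "PP"),
})


def realizations_by_fe(sentences):
    """Count, per FE, how often each (GF, PT) pairing occurs across sentences."""
    counts = defaultdict(Counter)
    for triples in sentences:
        for fe, gf, pt in triples:
            counts[fe][(gf, pt)] += 1
    return counts


summary = realizations_by_fe([grill_sentence])
print(summary["Food"])  # Counter({('Object', 'NP'): 1})
```

Passing many annotated sentences for the same LU to `realizations_by_fe` would show, for each FE, which syntactic realizations are common and which are rare.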
Because the frames are basically semantic, they are often similar across languages; for example, frames about buying and selling involve the FEs Buyer, Seller, Goods, and Money, regardless of the language in which they are expressed. Several projects are underway to build FrameNets parallel to the English FrameNet project for languages around the world, including Spanish, German, Chinese, and Japanese, and frame-semantic analysis and annotation have been carried out in specialized areas from legal terminology to soccer to tourism.