Complex structured multimedia documents contain a rich variety of information that appears in different forms and is combined under diverse schemata. Their analysis is a demanding operation that requires specific per-medium processing techniques to be developed, assembled and fused. Such analysis will enable multimedia document interpretation and adaptation in the context of an evolving application domain. In BOEMIE, the objective of a Methodology for Semantics Extraction from Multimedia Content (as foreseen in the DoW, p. 44) is twofold: to specify how information from the multimedia semantic model can be used to achieve semantics extraction from various modalities (text, image, video and audio), and to define an open architecture that communicates with the ontology evolution modules in WP4, accessing existing knowledge and providing back newly extracted information. This document describes the architectural and methodological choices that we believe will lead us to the fulfilment of the above objective.