BILab Laboratory

BILab (Business Intelligence Lab) is a joint research laboratory between Télécom ParisTech and EDF R&D devoted to business intelligence.

News

The next session of the Business Intelligence seminar will take place on Thursday, 7 July 2011, from 3 p.m. to 5 p.m., at Télécom ParisTech, 46 rue Barrault, Paris 13ème, in lecture hall B310.

The talks will be:

Scalable storage for Map-Reduce applications: the BlobSeer approach
Presented by: Alexandru Costan (INRIA)

Abstract: Current Map-Reduce frameworks such as Hadoop have shown a series of limitations for data-intensive applications. The ANR Map-Reduce project aims to overcome them by enabling highly scalable Map-Reduce-based data processing on various physical platforms such as clouds, desktop grids, or hybrid infrastructures built by combining the two. To meet this global goal, several critical aspects need investigation: data storage and sharing architectures, scheduling, fault tolerance and security. The project explores how combining these techniques can improve the behavior of Map-Reduce-based applications on the target large-scale infrastructures.

In this talk we focus on the first problem: how to efficiently store and access very large binary data objects (BLOBs) in large-scale distributed environments. We consider data-intensive application scenarios where a large number of clients concurrently read, write and append data to huge BLOBs that are fragmented and distributed at a very large scale. We argue that scalability under heavy concurrency can be achieved by combining multiversioning with distributed metadata management. We introduce an approach called BlobSeer, currently developed at INRIA Rennes - Bretagne Atlantique. Experiments with BlobSeer as a storage backend in the Hadoop MapReduce framework demonstrate substantial benefits to data-intensive workloads. We will describe current work on exploring how to efficiently exploit the BlobSeer approach at different levels in cloud environments such as Nimbus (from Argonne National Labs) and Microsoft Azure.

More info: http://mapreduce.inria.fr

Recent Advances Towards MapReduce on Desktop Grids
Presented by: Gilles Fedak (INRIA)

Abstract: Desktop Grids use the compute, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet to run a large variety of resource-demanding distributed applications. Today, platforms of this type, such as SETI@Home, are among the largest distributed computing systems and currently provide scientists with PetaFLOPS from hundreds of thousands of hosts. Despite the attractiveness of this platform, the application domain is still limited to Bag-of-Tasks applications with little I/O.

This talk will present the challenges that data-intensive applications face in this context of massively distributed, volatile, heterogeneous and network-limited resources, and the recent advances in terms of data distribution, data sharing and data processing. We will describe our approach to addressing these challenges: the BitDew framework, a programmable environment for automatic and transparent data management on computational Desktop Grids. BitDew relies on a specific set of metadata to drive key data management operations, namely life cycle, distribution, placement, replication and fault tolerance, with a high level of abstraction. To illustrate our approach, an implementation of the MapReduce programming model will be presented as well.
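
For readers unfamiliar with the MapReduce programming model that both talks build on, the following is a minimal single-process sketch of a word count in Python. It is not BitDew, BlobSeer or Hadoop code: the names `map_words`, `reduce_counts` and `run_mapreduce` and the sample documents are hypothetical, chosen only for illustration, and a real framework would distribute the map and reduce phases over many hosts and handle storage and failures.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Map phase (hypothetical example): emit (word, 1) for each word."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase (hypothetical example): sum the counts for one word."""
    return word, sum(counts)


def run_mapreduce(
    documents: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, shuffle and reduce sequentially in one process."""
    # Shuffle step: group every intermediate value under its key.
    groups: Dict[str, List[int]] = defaultdict(list)
    for document in documents:
        for key, value in mapper(document):
            groups[key].append(value)
    # Reduce step: one reducer call per distinct key.
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    docs = ["map reduce on desktop grids", "map reduce on clouds"]
    print(run_mapreduce(docs, map_words, reduce_counts))
```

The mapper and reducer are passed in as plain functions, so the same driver can be reused for other jobs; only the shuffle step, which groups values by key, is fixed.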