Thursday, 12 April 2012

The Emmy-award-winning Moving Picture Experts Group (MPEG), the committee that has developed the MP3, MPEG-2, MPEG-4 and a host of other standards that have transformed and enriched the way humans interact with media, will hold its 100th meeting in Geneva, Switzerland from 30 April to 4 May 2012.
On 2 May MPEG will hold the “MPEG 100 Event” to celebrate close to a quarter of a century of intense activity that has seen thousands of digital media experts from tens of countries and hundreds of companies working collaboratively to advance the frontiers of technology.
The event will be attended by top-ranking officials from the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), the organizations co-sponsoring the Joint ISO/IEC Technical Committee (JTC 1) on Information Technology under which MPEG operates; from the International Telecommunication Union (ITU), with which MPEG has developed two video compression standards and is currently developing the High Efficiency Video Coding (HEVC) standard; and from the World Intellectual Property Organization (WIPO).
Digital media have brought users a revolution in the way media are created, distributed and consumed, with profound ramifications for industry, society and individuals. Digital media are now an integral part of billions of people's lives, making them better, more interconnected and more social.
The MPEG 100 event will be an important opportunity to confirm that international organisations maintain close cooperation in charting the future of digital media.
Tuesday, 26 April 2011
New transport protocols for a better user experience
In the 1980s the telecom industry decided it needed a “broadband standard” and started defining protocols under the project name “Asynchronous Transfer Mode” (ATM). Today most people know ATM as something completely different, because the project, which actually led to significant deployments in the telecom networks of various countries, came to a stop in the face of a competing technology called Internet Protocol (IP).
A decade later IP, which did not require an access speed of 155 Mbit/s like ATM, started being deployed at bitrates that on average were some three orders of magnitude below ATM's. IP seemed to provide the level of speed that could make customers happy while keeping the Plain Old Telephone System (POTS) in place, whereas ATM required reaching millions of subscribers' homes with optical fibre.
No one can blame telcos for trying to save the trillions of dollars that optical fibre would cost by opting for the less costly Asymmetric Digital Subscriber Line (ADSL). The reality, though, is that our society is more and more video-dependent, while the fixed telecommunication infrastructure cannot provide the bandwidth its users require. A lot of talk is being made these days around the “Next Generation Network” (NGN) acronym and, in due time, something is bound to come out of it, but prospects are dim for the mobile network, squeezed between a terrestrial broadcasting industry that sticks to its Ultra High Frequency (UHF) legacy and a need to carry video on mobile networks that multiplies by the day.
Video is a strange beast. From time to time I receive the question: how many bit/s are required to transmit video? My regular answer is: as many as you want, even no bits at all. Some may see this answer as unhelpful, but it contains a profound truth, namely that video is a remarkably flexible beast: you can decide how many bit/s you use to transmit it, and no matter how few bits you use, your correspondent will always see “something”.
Operators have exploited this feature to cope with the wide dynamics of network characteristics. If the transmitter is informed that the receiver is unable to receive all the bits it needs to decode a video, it can switch to a version of the video encoded at a lower bitrate. The user at the receiving side will see a less crisp picture and may complain about the shortsightedness of telcos that did not invest in ATM (if they had, what would the phone bill be today?), but that is still better than a picture that keeps freezing.
The problem is that operators have independently decided to use their own transmitter-receiver protocols. This was acceptable at a time when video was a pastime of the few, but it is no longer a solution today, when video is so pervasive.
MPEG has spotted this problem and is close to releasing a new standard called DASH. The acronym stands for Dynamic Adaptive Streaming over HTTP and is almost self-explanatory: the pervasive HyperText Transfer Protocol is used to stream video, but the bitrate is dynamically adapted to network conditions using a standard protocol that any implementer can use to build interoperable solutions.
See a technical explanation at http://mpeg.chiariglione.org/technologies/mpeg-b/mpb-dash/index.htm
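For the curious, here is a small Python sketch of the kind of rate adaptation a DASH client performs. It is only an illustration of the idea: the standard specifies the formats, not the client's adaptation algorithm, and all names, bitrates and the 0.8 safety factor below are invented for the example.

```python
# Illustrative sketch of DASH-style rate adaptation. This is not the
# normative MPEG-DASH client algorithm (the standard leaves adaptation
# strategy to implementers); names and values are hypothetical.

from dataclasses import dataclass

@dataclass
class Representation:
    """One encoding of the same video, as advertised in a manifest."""
    bitrate_bps: int     # average bitrate of this encoding
    url_template: str    # where its segments live (illustrative)

def pick_representation(reps, measured_throughput_bps, safety_factor=0.8):
    """Choose the highest-bitrate representation the network can sustain.

    A safety margin is kept so throughput fluctuations do not immediately
    stall playback; 0.8 is an arbitrary illustrative value.
    """
    budget = measured_throughput_bps * safety_factor
    affordable = [r for r in sorted(reps, key=lambda r: r.bitrate_bps)
                  if r.bitrate_bps <= budget]
    # Fall back to the lowest representation: the user still sees
    # "something" (a less crisp picture) rather than a frozen one.
    return affordable[-1] if affordable else min(reps, key=lambda r: r.bitrate_bps)

if __name__ == "__main__":
    reps = [
        Representation(500_000, "video_500k/seg_$Number$.m4s"),
        Representation(1_500_000, "video_1500k/seg_$Number$.m4s"),
        Representation(4_000_000, "video_4000k/seg_$Number$.m4s"),
    ]
    # Throughput as measured while downloading the previous segment.
    print(pick_representation(reps, measured_throughput_bps=2_000_000).url_template)
```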
Leonardo Chiariglione
Wednesday, 9 March 2011
SAF, the aggregation of LASeR and audiovisual material
SAF (Simple Aggregation Format) is the part of the LASeR standard that defines tools to fulfill the requirements of rich-media service design at the interface between scene representation and transport mechanisms. SAF features the following functionality:
- simple aggregation of any type of media stream (MPEG or non-MPEG), resulting in a SAF stream with a low-overhead multiplexing scheme for low-bandwidth networks, and
- the possibility to cache SAF streams.
The result of the multiplexing of media streams is a SAF stream which can be delivered over any delivery mechanism: download-and-play, progressive download, streaming or broadcasting.
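To illustrate the aggregation idea (and only the idea: the actual SAF packet syntax is defined in the standard and is not reproduced here), the following Python sketch interleaves access units from several streams by timestamp, each unit carrying a small header identifying its stream:

```python
# Illustrative sketch of SAF-style aggregation: access units from several
# media streams are merged by timestamp into a single stream, each tagged
# with its stream id. Not the normative SAF syntax; names are invented.

import heapq
from dataclasses import dataclass

@dataclass(order=True)
class AccessUnit:
    timestamp: float   # presentation time of the unit
    stream_id: int     # which elementary stream it belongs to
    payload: bytes     # coded media data (opaque here)

def aggregate(streams):
    """Merge per-stream, timestamp-ordered lists of AccessUnits into one."""
    return list(heapq.merge(*streams))

audio = [AccessUnit(0.00, 1, b"a0"), AccessUnit(0.02, 1, b"a1")]
video = [AccessUnit(0.00, 2, b"v0"), AccessUnit(0.04, 2, b"v1")]
scene = [AccessUnit(0.00, 3, b"s0")]

for au in aggregate([audio, video, scene]):
    print(au.stream_id, au.timestamp)
```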
The purpose of the LASeR Systems decoder model is to provide an abstract view of the behaviour of the terminal. It may be used by the sender to predict how the receiving terminal will behave in terms of buffer management and synchronization when decoding data received in the form of elementary streams. The LASeR systems decoder model includes a timing model and a buffer model. The LASeR systems decoder model specifies:
- the conceptual interface for accessing data streams (Delivery Layer),
- decoding buffers for coded data for each elementary stream,
- the behavior of elementary stream decoders,
- composition memory for decoded data from each decoder, and
- the output behavior of composition memory towards the compositor.
Each elementary stream is attached to one single decoding buffer.
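The following Python sketch illustrates this data flow for one elementary stream: the delivery layer feeds the decoding buffer, the decoder fills the composition memory, and the compositor consumes it. It is a conceptual toy, not the normative model; all class and function names are invented for the example.

```python
# Conceptual sketch of the LASeR systems decoder model's data flow for a
# single elementary stream. Names are hypothetical, not from the standard.

from collections import deque

class ElementaryStreamPath:
    """One stream's path: decoding buffer -> decoder -> composition memory."""

    def __init__(self, decode_fn):
        self.decoding_buffer = deque()     # coded AUs; one buffer per stream
        self.composition_memory = deque()  # decoded units awaiting composition
        self.decode = decode_fn

    def receive(self, access_unit):
        # The delivery layer hands a coded access unit to the decoding buffer.
        self.decoding_buffer.append(access_unit)

    def run_decoder(self):
        # The elementary stream decoder drains its buffer into composition memory.
        while self.decoding_buffer:
            self.composition_memory.append(self.decode(self.decoding_buffer.popleft()))

    def compose(self):
        # The compositor consumes decoded data from composition memory.
        while self.composition_memory:
            yield self.composition_memory.popleft()

path = ElementaryStreamPath(decode_fn=lambda au: au.upper())
path.receive("au1"); path.receive("au2")
path.run_decoder()
print(list(path.compose()))   # ['AU1', 'AU2']
```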
A multimedia presentation is a collection of a scene description and media (zero, one or more). A medium is an individual piece of audiovisual content of one of the following types: image (still picture), video (moving pictures), audio and, by extension, font data. A scene description consists of text, graphics, animation, interactivity and spatial and temporal layout. The sequence of a scene description and its timed modifications is called a scene description stream. A scene description stream is called a LASeR stream.
Modifications to the scene are called LASeR Commands. A command acts on elements or attributes of the scene at a given instant in time. LASeR Commands that need to be executed at the same time are grouped into one LASeR Access Unit (AU).
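As a toy illustration of this grouping (the operation names below are in the spirit of LASeR's scene updates, but the representation is invented for the example), one can bucket commands by their execution time:

```python
# Hypothetical sketch: commands that must execute at the same scene time
# form one Access Unit. Not the normative LASeR representation.

from itertools import groupby

commands = [
    {"time": 0.0, "op": "Insert",  "target": "rect1"},
    {"time": 0.0, "op": "Insert",  "target": "text1"},
    {"time": 2.0, "op": "Replace", "target": "text1"},
    {"time": 2.0, "op": "Delete",  "target": "rect1"},
]

# One Access Unit per execution instant, containing all its commands.
access_units = [
    {"time": t, "commands": list(cmds)}
    for t, cmds in groupby(sorted(commands, key=lambda c: c["time"]),
                           key=lambda c: c["time"])
]

for au in access_units:
    print(au["time"], [c["op"] for c in au["commands"]])
```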
A scene description specifies four aspects of a presentation:
- how the scene elements (media or graphics) are organised spatially, e.g. the spatial layout of the visual elements;
- how the scene elements (media or graphics) are organised temporally, i.e. if and how they are synchronised, when they start or end;
- how to interact with the elements in the scene (media or graphics), e.g. when a user clicks on an image;
- and if the scene is changing, how the scene changes happen.
Mattia Donna Bianco
Friday, 4 March 2011
LASeR Standard
LASeR (Lightweight Application Scene Representation) is the MPEG rich-media standard dedicated to the mobile, embedded and consumer electronics industries. LASeR provides a fluid user experience of enriched content, including audio, video, text and graphics, on constrained networks and devices.
The LASeR standard is specified in MPEG-4 Part 20 (ISO/IEC 14496-20).
The LASeR standard specifies the coded representation of multimedia presentations for rich-media services. In the LASeR specification, a multimedia presentation is a collection of a scene description and media (zero, one or more). A medium is an individual piece of audiovisual content of one of the following types: image (still picture), video (moving pictures), audio and, by extension, font data. A scene description is composed of text, graphics, animation, interactivity and spatial and temporal layout.
A LASeR scene description specifies four aspects of a presentation:
- how the scene elements (media or graphics) are organized spatially, e.g. the spatial layout of the visual elements;
- how the scene elements (media or graphics) are organized temporally, i.e. if and how they are synchronized, when they start or end;
- how to interact with the elements in the scene (media or graphics), e.g. when a user clicks on an image;
- and if the scene is changing, how these changes happen.
LASeR handles access units, i.e. self-contained chunks of data, which may be adapted for transmission over a variety of protocols. LASeR streams may be packaged with some or all of their related media into files of the ISO base media file format family (e.g. MP4) and delivered over reliable protocols.
LASeR:
- Brings smart and pleasurable navigation within streamed and real-time AV content,
- Is compliant with existing business models, and
- Allows increased ARPU (Average Revenue Per User) by boosting service subscriptions thanks to interactivity.
Mattia Donna Bianco
Tuesday, 8 February 2011
What is a machine-readable license?
In recent years the use of digital technologies has increased dramatically; we can find these technologies in all the contexts in which media content is created, distributed and consumed. Nowadays, for example, it is possible to buy a single song on the internet using only digital technologies. The broad use of these technologies has created a clash between those who create content, those who distribute it and those who consume it.
In order to regulate the use and distribution of digital content, licenses may be employed.
A license is a tool that aims to express what users can and cannot do with the licensed content (a video, a song, an ebook...).
To be machine-readable, licenses must be expressed in a particular way. For this reason a language called Rights Expression Language (REL) has been formalized.
One of the most important RELs is MPEG REL (MPEG-21 Part 5). MPEG REL adopts a simple and extensible data model for many of its key concepts and elements. This data model consists of four entities and the relationships among them, captured by the basic MPEG REL assertion, the “grant”. A grant consists of: the principal to whom the grant is issued, the right that the grant specifies, the resource to which the right applies and, finally, the conditions that must be met before the right can be exercised.
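As an illustration, the grant's four-part structure can be modelled in a few lines of Python. This is a sketch of the data model only; the real MPEG REL is an XML language, and the identifiers below are invented for the example.

```python
# Illustrative model of the MPEG REL "grant" described above: a sketch of
# the four-entity data model, not the normative MPEG-21 Part 5 XML schema.

from dataclasses import dataclass

@dataclass
class Grant:
    principal: str   # to whom the grant is issued, e.g. a user identifier
    right: str       # what may be done, e.g. "play"
    resource: str    # the content the right applies to
    condition: str   # what must hold before the right can be exercised

grant = Grant(
    principal="urn:example:user:alice",    # hypothetical identifier
    right="play",
    resource="urn:example:video:42",       # hypothetical identifier
    condition="before 2011-12-31",
)

def may_exercise(grant, principal, right, resource, condition_met):
    """A license check: all four parts of the grant must match/hold."""
    return (grant.principal == principal and grant.right == right
            and grant.resource == resource and condition_met)

print(may_exercise(grant, "urn:example:user:alice", "play",
                   "urn:example:video:42", condition_met=True))   # True
```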
In wim.tv the REL model is used for all transactions, be they B2C or B2B: the former between WebTVs and End Users, the latter between the other actors. In these transactions it is possible to express payment and reissuing conditions.
Edoardo Radica
Monday, 10 January 2011
A roadmap to converging video services
Despite the rosy pictures we are often shown of the encroachment of new media on TV's turf, and the statistical evidence suggesting that more people spend more time with non-TV video, TV is as healthy as ever. A recent Nielsen report says Americans watched more TV in 2010 than ever before: total viewing of broadcast networks and basic cable channels is up about 1 percent, to roughly 34 hours per person per week.
“Conservative” extensions of TV to the web like Hulu, Netflix or Apple TV are reported to fare rather well. On the other hand “innovative” attempts at integrating the television and “web video” experiences, like Google TV, receive mixed reports and see their deployment delayed.
The issue is further complicated by the underground battle around the enabling technologies to be adopted for streaming video to the end user via the internet. In the “analogue TV” age Consumer Electronics (CE) has thrived by adhering to established standards. In the now consolidated “digital TV” age CE has kept on thriving based on established standards. Should the “TV on the web” age be dominated by a handful of behemoths brandishing their technologies as a weapon to preserve and extend their walled gardens?
Judging from the number of initiatives addressing the need for standards in this space, one would say that the relevant industries do think that proprietary technologies should not be the only game in town. Unfortunately, most initiatives have issued, or are in the process of issuing, specifications that appear to be driven by the desire of industries to protect their existing businesses by adding new features while keeping out potential new competitors. Whether this is what consumers are interested in is another story, and one that may very well not be on those industries' priority list.
ISO/IEC JTC 1/SC 29/WG 11 (MPEG) has been working for the last few years – and keeps doing so – to develop the key technologies that will enable, as was done for digital TV, the creation of a level playing field on which the third generation of CE can flourish. Some of these technologies target:
- New video and audio compression for more rewarding user experiences while keeping down the bitrate
- Media composition and presentation
- More attractive ways for the user to interact with services
- More effective ways to deliver content to end users when the network is unreliable
- Multichannel distribution of content
- New ways to do business with content
This collection of basic technologies is very important for a smooth transition from “digital TV” to “TV on the web” based on standards. To make this happen, however, industry needs comprehensive specifications that combine these technologies so that they can be seamlessly integrated in products and deployed to provide interoperable services.
In 2008 the Digital Media Project (DMP), an industry association based in Geneva, launched a new project on the “Digital media platform for the 2nd decade of the 21st century” (P21-2). The goal of this project is to integrate all the technologies required to provide a solution that is attractive for consumers, profitable for content creators, secure for service providers and rewarding for device manufacturers.
A precursor of P21-2 is wim.tv, a service on the web that lets different types of entrepreneurs do business with video and advertisement content. wim.tv is enabled by CEDEO’s Platform for Digital Asset Trading (PDAT), designed to offer users all services required to do business with video content on the web effectively and profitably, e.g.
- Describe content
- Negotiate terms
- Request/generate/process events
- Issue licences
- Associate content/ads
- Stream video securely
- Interact with content
- Pay/cash
PDAT is an early implementation of the emerging MPEG-M standard (ISO/IEC 23006, Multimedia Service Platform Technologies). Its modular architecture allows for the easy replacement of existing modules and the introduction of new ones, extending the range of services offered to users.
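As a rough illustration of what such a modular architecture buys (a sketch only: the names below are invented and are not the actual PDAT or MPEG-M API), consider a platform where each service is a module registered behind a common interface, so that a module can be replaced without touching its callers:

```python
# Hypothetical sketch of a modular service platform in the spirit of the
# PDAT/MPEG-M description above. None of these names come from the actual
# MPEG-M (ISO/IEC 23006) specification.

class ServicePlatform:
    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        # Registering under an existing name replaces the module, extending
        # or updating the platform without changing the code that calls it.
        self._modules[name] = module

    def call(self, name, *args, **kwargs):
        return self._modules[name](*args, **kwargs)

platform = ServicePlatform()
platform.register("describe_content", lambda cid: {"id": cid})
platform.register("issue_licence", lambda cid, user: f"licence({cid},{user})")

print(platform.call("issue_licence", "video42", "alice"))

# Swapping in a new licence module leaves callers untouched:
platform.register("issue_licence", lambda cid, user: f"licence-v2({cid},{user})")
print(platform.call("issue_licence", "video42", "alice"))
```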
Currently PDAT supports the following browsers: IE, Firefox, Chrome and Safari, running on Android, Linux, Mac OS (10.5 onward) and Windows (XP onward). The wim.tv player is a PDAT plugin.
Wim.tv is an ideal platform for the convergence of television services. It is based on international standards, has a growing community and its player is easily portable in such environments as Web, IP and mobile TV.
Initiatives such as wim.tv can provide the video ecosystem with the means to move to the next level, thanks to the existence of standard APIs to access services.
Leonardo Chiariglione