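November 13, 2004

Studying network transport protocols: TCP (UDP)
    TCP is a connection-oriented protocol. It uses timeout-based retransmission, flow control and similar mechanisms to guarantee reliable delivery, but it cannot meet real-time requirements, so it is generally used for data that is not time-critical. UDP is a connectionless protocol. It does not guarantee delivery, but it is simple and efficient, and when the network offers some QoS guarantee it is well suited to real-time transfer of large amounts of data.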
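A minimal sketch of the UDP half of this comparison, assuming a POSIX system. The helper name udp_loopback_echo is made up for illustration. It sends one datagram to its own socket over loopback and reads it back. Nothing in it retransmits or acknowledges, so the only safe expectation is "the datagram arrives intact or not at all".

```cpp
// Sketch only: POSIX sockets, loopback, one datagram.
// udp_loopback_echo is a hypothetical helper, not part of any library.
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cstring>
#include <string>

// Returns the payload that came back, or an empty string on any failure.
std::string udp_loopback_echo(const std::string &msg)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);    // SOCK_DGRAM: no connection setup
    if (fd < 0) return "";

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          // let the kernel pick a free port

    socklen_t len = sizeof(addr);
    if (bind(fd, (sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (sockaddr *)&addr, &len) < 0) {
        close(fd);
        return "";
    }

    // UDP gives no delivery guarantee, so never wait forever for a reply.
    timeval tv;
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    // One sendto() is one datagram: no retransmission, no flow control.
    ssize_t sent = sendto(fd, msg.data(), msg.size(), 0,
                          (sockaddr *)&addr, sizeof(addr));
    char buf[1500];
    ssize_t got = (sent < 0) ? -1 : recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
    close(fd);
    return got < 0 ? std::string() : std::string(buf, (size_t)got);
}
```

A TCP version would use SOCK_STREAM plus connect()/accept(), and the kernel would then handle retransmission and flow control.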
 
To be continued…

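November 9, 2004

“Portable, extensible, open-source frameworks are not a new idea (Emacs comes to mind), but thanks to its mature, robust and elegant design, Eclipse brings an entirely new dynamic. IBM’s release of 40 million dollars’ worth of world-class software into the open-source world has given the industry a jolt it has not felt in a long time.”

I don’t understand what makes the design of Eclipse’s underlying engine superior or elegant; what I care about is speed and performance. CDT lets Eclipse support C/C++ development, but as an IDE written in Java its speed is embarrassing no matter what, especially for C/C++ developers, who are more sensitive to speed. And I would rather not even mention how heavy JBuilder is!

If Eclipse’s architecture is so good, why has no expert taken its ideas and rewritten it in C/C++? Could it be out of fear of IBM’s might?

November 8, 2004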

Using libavformat and libavcodec: An Update

Martin Böhme (boehme@inb.uni-luebeckREMOVETHIS.de)

July 21, 2004

A few months ago, I wrote an article on using the libavformat and libavcodec libraries that come with ffmpeg. Since then, I have received a number of comments, and a new prerelease version of ffmpeg (0.4.9-pre1) has recently become available, adding support for seeking in video files, new file formats, and a simplified interface for reading video frames. These changes have been in the CVS for a while, but now is the first time we get to see them in a release. (Thanks by the way to Silviu Minut for sharing the results of long hours of studying the CVS versions of ffmpeg – his page with ffmpeg information and a demo program is here.)

In this article, I’ll describe only the differences between the previous release (0.4.8) and the new one, so if you’re new to libavformat / libavcodec, I suggest you read the original article first.

First, a word about compiling the new release. On my compiler (gcc 3.3.1 on SuSE), I get an internal compiler error while compiling the source file ffv1.c. I suspect this particular version of gcc is a little flaky – I’ve had the same thing happen to me when compiling OpenCV – but at any rate, a quick fix is to compile this one file without optimizations. The easiest way to do this is to do a make, then when the build hits the compiler error, change to the libavcodec subdirectory (since this is where ffv1.c lives), copy the gcc command to compile ffv1.c from your terminal window, paste it back in, edit out the "-O3" compiler switch and then run gcc using that command. After that, you can change back to the main ffmpeg directory and restart make, and it should complete the build.

What’s New?

So what’s new? From a programmer’s point of view, the biggest change is probably the simplified interface for reading individual video frames from a video file. In ffmpeg 0.4.8 and earlier, data is read from the video file in packets using the routine av_read_packet(). Usually, the information for one video frame is spread out over several packets, and the situation is made even more complicated by the fact that the boundary between two video frames can fall in the middle of a packet. Thankfully, ffmpeg 0.4.9 introduces a new routine called av_read_frame(), which returns all of the data for a video frame in a single packet. The old way of reading video data using av_read_packet() is still supported but deprecated – I say: good riddance.

So let’s take a look at how to access video data using the new API. In my original article (with the old 0.4.8 API), the main decode loop looked like this:

while(GetNextFrame(pFormatCtx, pCodecCtx, videoStream, pFrame)) {
    img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24, (AVPicture*)pFrame,
        pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);
    // Process the video frame (save to disk etc.)
    DoSomethingWithTheImage(pFrameRGB);
}

GetNextFrame() is a helper routine that handles the process of assembling all of the packets that make up one video frame. The new API simplifies things to the point that we can do the actual reading and decoding of data directly in our main loop:

while(av_read_frame(pFormatCtx, &packet)>=0) {
    // Is this a packet from the video stream?
    if(packet.stream_index==videoStream) {
        // Decode video frame
        avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
            packet.data, packet.size);
        // Did we get a video frame?
        if(frameFinished) {
            // Convert the image from its native format to RGB
            img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24,
                (AVPicture*)pFrame, pCodecCtx->pix_fmt,
                pCodecCtx->width, pCodecCtx->height);
            // Process the video frame (save to disk etc.)
            DoSomethingWithTheImage(pFrameRGB);
        }
    }
    // Free the packet that was allocated by av_read_frame
    av_free_packet(&packet);
}

At first sight, it looks as if things have actually gotten more complex – but that is just because this piece of code does things that used to be hidden in the GetNextFrame() routine (checking if the packet belongs to the video stream, decoding the frame and freeing the packet). Overall, because we can eliminate GetNextFrame() completely, things have gotten a lot easier.

I’ve updated the demo program to use the new API. Simply comparing the number of lines (222 lines for the old version vs. 169 lines for the new one) shows that the new API has simplified things considerably.

Another important addition in the 0.4.9 release is the ability to seek to a certain timestamp in a video file. This is accomplished using the av_seek_frame() function, which takes three parameters: A pointer to the AVFormatContext, a stream index and the timestamp to seek to. The function will then seek to the first key frame before the given timestamp. All of this is from the documentation – I haven’t gotten round to actually testing av_seek_frame() yet, so I can’t present any sample code either. If you’ve used av_seek_frame() successfully, I’d be glad to hear about it.
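The "first key frame before the given timestamp" rule can be modelled without touching ffmpeg at all. The sketch below is not av_seek_frame(). It is a stand-alone illustration over a sorted list of key-frame timestamps. The function name is invented, and treating a key frame that sits exactly on the target as a match is my assumption.

```cpp
// Stand-alone model of "seek to the last key frame at or before a timestamp".
// Not ffmpeg code; last_keyframe_at_or_before is a hypothetical helper.
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// key_ts must be sorted in ascending order.
// Returns the index of the last key frame whose timestamp is <= target,
// or -1 if every key frame lies after the target.
int last_keyframe_at_or_before(const std::vector<int64_t> &key_ts, int64_t target)
{
    // upper_bound finds the first key frame strictly after the target;
    // the one we want is the element just before it.
    std::vector<int64_t>::const_iterator it =
        std::upper_bound(key_ts.begin(), key_ts.end(), target);
    return static_cast<int>(it - key_ts.begin()) - 1;
}
```

A decoder has to start at a key frame because the frames after it are coded as differences. To land exactly on the target it would then decode forward from that key frame and drop frames until the target is reached.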

Frame Grabbing (Video4Linux and IEEE1394)

Toru Tamaki sent me some sample code that demonstrates how to grab frames from a Video4Linux or IEEE1394 video source using libavformat / libavcodec. For Video4Linux, the call to av_open_input_file() should be modified as follows:

AVFormatParameters formatParams;
AVInputFormat *iformat;
// Zero the struct first so that fields we do not set are not garbage
memset(&formatParams, 0, sizeof(formatParams));
formatParams.device = "/dev/video0";
formatParams.channel = 0;
formatParams.standard = "ntsc";
formatParams.width = 640;
formatParams.height = 480;
formatParams.frame_rate = 29;
formatParams.frame_rate_base = 1;
filename = "";
iformat = av_find_input_format("video4linux");
av_open_input_file(&ffmpegFormatContext, filename, iformat, 0, &formatParams);

For IEEE1394, call av_open_input_file() like this:

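AVFormatParameters formatParams;
AVInputFormat *iformat;
// Zero the struct first so that fields we do not set are not garbage
memset(&formatParams, 0, sizeof(formatParams));
formatParams.device = "/dev/dv1394";
filename = "";
iformat = av_find_input_format("dv1394");
av_open_input_file(&ffmpegFormatContext, filename, iformat, 0, &formatParams);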

To be continued…

If I come across additional interesting information about libavformat / libavcodec, I plan to publish it here. So, if you have any comments, please contact me at the address given at the top of this article.

Standard disclaimer: I assume no liability for the correct functioning of the code and techniques presented in this article.

//demo source code

// Bare-bones ffmpeg (cvs) decoding.
// It can handle any codec/format that ffmpeg can (mpeg, avi, quicktime, jpeg,
// and many more). The key function is decode().
//
//
// Author: Silviu Minut
// Date: July 2004.
//
//
// Licence: GPL
//
// Compile:
// g++ -g -Wall -o decode ffmpeg_decode.C -lavcodec_acl -lavformat_acl -lz
//
// Usage:
// ./decode video_file.mpeg
// 
//
// TODO:
//
// Proper error checking.
// Proper av cleanup.
// Seek/abort.
// Threads: consumer/producer model on a packet queue as in ffplay.c.
// Threads: consumer/producer model on an image queue as in ffplay.c.
//
//                          read
// encoded packets on disk -----> encoded packets queue ->
//
//                          decode             display
//                         -----> img queue ---------> screen/disk.

#include <fstream>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <string>
#include <cstdlib>
#include <unistd.h>
#include <cmath>
#include <cctype>
#include <sys/types.h>
#include <inttypes.h>

// Use the correct path here for avformat.
#include <ffmpeg_acl/ffmpeg/avformat.h>

using namespace std;

 

///////////////////////////////////////////////////////////////////////////////
//
// Auxiliary functions
//
///////////////////////////////////////////////////////////////////////////////

// Add more useful options, these don't do anything right now.
void usage(char * prog)
{
    cout << endl << "\e[1;31mUsage: \e[0m"
         << prog << " [Options]" << endl
         << endl;

    cout << "\e[1;33mOptions: \e[0m" << endl
         << "\t-d            do something..." << endl
         << "\t-h            help" << endl
         << endl;

    cout << "\e[1;34mPurpose: \e[0m" << endl
         << endl;
    exit(-1);
}

// Save the raw rgb data to a stream (either cout, or to a file) (pgm/ppm).
void save_ppm(ostream & OUT, const uint8_t * rgb, size_t cols, size_t rows,
              int pixsize)
{
    switch(pixsize){
    case 1:
        OUT << "P5" << endl;
        break;
    case 3:
        OUT << "P6" << endl;
        break;
    default:
        cerr << "Can't handle pixel size: " << pixsize << endl;
        exit(-1);
    }

    OUT << cols << " " << rows << endl
        << 255 << endl;
    OUT.write((const char*) rgb, rows*cols*pixsize);
}

void save_ppm(const uint8_t * rgb, size_t cols, size_t rows, int pixsize,
              const char * file=NULL)
{
    if(file){
        fstream OUT(file, ios::out);
        save_ppm(OUT, rgb, cols, rows, pixsize);
        OUT.close();
    }
    else
        save_ppm(cout, rgb, cols, rows, pixsize);
}

// Turn an integer into a 0-padded string of a certain width.
string int2string(const int n, const int width, const char * extension=NULL)
{
    string name;
    ostringstream counter;

    counter << setfill('0') << setw(width) << n ;
    name=counter.str();
    if(extension!=NULL)
        name += extension;
    return name;
}

 

///////////////////////////////////////////////////////////////////////////////
//
// Decode proper.
//
///////////////////////////////////////////////////////////////////////////////

void decode(char * input_filename)
{
    int err=0, i=0, counter=0, len1=0, got_picture=0, video_index=-1;

    AVFormatContext * fcx=NULL;
    AVCodecContext * ccx=NULL;
    AVCodec * codec=NULL;

    AVPicture pict;
    AVPacket pkt;
    AVFrame *frame=avcodec_alloc_frame();

    // Open the input file. Passing NULL format parameters (instead of an
    // uninitialized struct) asks libavformat to auto-detect the format.
    err = av_open_input_file(&fcx, input_filename, NULL, 0, NULL);
    if(err<0){
        cerr << "Can't open file: " << input_filename << endl;
        exit(-1);
    }

    // Find the stream info.
    err = av_find_stream_info(fcx);
    if(err<0){
        cerr << "Can't find stream info." << endl;
        exit(-1);
    }

    // Find the first video stream. video_index stays -1 if there is none.
    for(i=0; i<fcx->nb_streams; i++){
        if(fcx->streams[i]->codec.codec_type==CODEC_TYPE_VIDEO){
            ccx=&fcx->streams[i]->codec;
            video_index=i;
            break;
        }
    }

    // Open stream.
    // FIXME: proper closing of av before exit.
    if(video_index>=0){
        codec = avcodec_find_decoder(ccx->codec_id);
        // A missing decoder is an error too, not only a failed open.
        if(codec==NULL || avcodec_open(ccx, codec)<0) {
            cerr << "Can't open codec." << endl;
            exit(-1);
        }
    }
    else{
        cerr << "Video stream not found." << endl;
        exit(-1);
    }

    // Decode proper
    while(1){

        // Read a frame/packet.
        if(av_read_frame(fcx, &pkt) < 0 ) break;

        // If it's a video packet from our video stream...
        if(pkt.stream_index==video_index){

            // Decode the packet
            len1 = avcodec_decode_video(ccx, frame, &got_picture,
                                        pkt.data, pkt.size);

            if (len1>=0 && got_picture) {

                // Allocate AVPicture the first time through.
                if(counter==0)
                    avpicture_alloc(&pict, PIX_FMT_RGB24,
                                    ccx->width, ccx->height);

                img_convert(&pict, PIX_FMT_RGB24, (AVPicture*) frame,
                            ccx->pix_fmt,
                            ccx->width, ccx->height);

                // Visual effects: display image (save to disk, for now).
                string name=int2string(counter, 5, ".ppm");
                save_ppm(pict.data[0], ccx->width, ccx->height, 3,
                         name.c_str());

                counter++;
            }
        }

        // Free every packet, not only the ones from the video stream.
        av_free_packet(&pkt);
    }

    // Clean up. pict was only allocated if at least one frame was decoded.
    if(counter>0)
        avpicture_free(&pict);
    av_free(frame);
    avcodec_close(ccx);
    av_close_input_file(fcx);
}

int main(int argc, char * argv[])
{
    int opt, start=1; // start is where the actual input starts, after options.
    u_int n=0;

    // Template for processing of command prompt arguments.
    // Add command line options as needed.
    while((opt=getopt(argc, argv, "n:h"))!=-1){
        switch(opt){
        case 'n':
            if(isdigit(optarg[0])){
                n = atoi(optarg);
                start+=2;
            }
            else usage(argv[0]);
            break;

        case 'h':
        default:
            usage(argv[0]);
        }
    }

    // Don’t do anything if we don’t have a file.
    if(!(start<argc)) usage(argv[0]);

    // Register – only once.
    av_register_all();

    decode(argv[start]);
   

    return 0;
}

Using libavformat and libavcodec

The libavformat and libavcodec libraries that come with ffmpeg are a great way of accessing a large variety of video file formats. Unfortunately, there is no real documentation on using these libraries in your own programs (at least I couldn’t find any), and the example programs aren’t really very helpful either.

This situation meant that, when I used libavformat/libavcodec on a recent project, it took quite a lot of experimentation to find out how to use them. Here’s what I learned – hopefully I’ll be able to save others from having to go through the same trial-and-error process. There’s also a small demo program that you can download. The code I’ll present works with libavformat/libavcodec as included in version 0.4.8 of ffmpeg (the most recent version as I’m writing this). If you find that later versions break the code, please let me know.

In this document, I’ll only cover how to read video streams from a file; audio streams work pretty much the same way, but I haven’t actually used them, so I can’t present any example code.

In case you’re wondering why there are two libraries, libavformat and libavcodec: Many video file formats (AVI being a prime example) don’t actually specify which codec(s) should be used to encode audio and video data; they merely define how an audio and a video stream (or, potentially, several audio/video streams) should be combined into a single file. This is why sometimes, when you open an AVI file, you get only sound, but no picture – because the right video codec isn’t installed on your system. Thus, libavformat deals with parsing video files and separating the streams contained in them, and libavcodec deals with decoding raw audio and video streams.

Opening a Video File

First things first – let’s look at how to open a video file and get at the streams contained in it. The first thing we need to do is to initialize libavformat/libavcodec:

av_register_all();

This registers all available file formats and codecs with the library so they will be used automatically when a file with the corresponding format/codec is opened. Note that you only need to call av_register_all() once, so it’s probably best to do this somewhere in your startup code. If you like, it’s possible to register only certain individual file formats and codecs, but there’s usually no reason why you would have to do that.

Next off, opening the file:

AVFormatContext *pFormatCtx;
const char *filename="myvideo.mpg";
// Open video file
if(av_open_input_file(&pFormatCtx, filename, NULL, 0, NULL)!=0)
    handle_error(); // Couldn't open file

The last three parameters specify the file format, buffer size and format parameters; by simply specifying NULL or 0 we ask libavformat to auto-detect the format and use a default buffer size. Replace handle_error() with appropriate error handling code for your application.

Next, we need to retrieve information about the streams contained in the file:

// Retrieve stream information
if(av_find_stream_info(pFormatCtx)<0)
    handle_error(); // Couldn’t find stream information

This fills the streams field of the AVFormatContext with valid information. As a debugging aid, we’ll dump this information onto standard error, but of course you don’t have to do this in a production application:

dump_format(pFormatCtx, 0, filename, false);

As mentioned in the introduction, we’ll handle only video streams, not audio streams. To make things nice and easy, we simply use the first video stream we find:

int i, videoStream;
AVCodecContext *pCodecCtx;
// Find the first video stream
videoStream=-1;
for(i=0; i<pFormatCtx->nb_streams; i++){
    if(pFormatCtx->streams[i]->codec.codec_type==CODEC_TYPE_VIDEO){
        videoStream=i;
        break;
    }
}
if(videoStream==-1)
    handle_error(); // Didn't find a video stream
// Get a pointer to the codec context for the video stream
pCodecCtx=&pFormatCtx->streams[videoStream]->codec;

OK, so now we’ve got a pointer to the so-called codec context for our video stream, but we still have to find the actual codec and open it:

AVCodec *pCodec;
// Find the decoder for the video stream
pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
if(pCodec==NULL)
    handle_error(); // Codec not found
// Inform the codec that we can handle truncated bitstreams — i.e.,
// bitstreams where frame boundaries can fall in the middle of packets
if(pCodec->capabilities & CODEC_CAP_TRUNCATED)
    pCodecCtx->flags|=CODEC_FLAG_TRUNCATED;
// Open codec
if(avcodec_open(pCodecCtx, pCodec)<0)
    handle_error(); // Could not open codec

(So what’s up with those “truncated bitstreams”? Well, as we’ll see in a moment, the data in a video stream is split up into packets. Since the amount of data per video frame can vary, the boundary between two video frames need not coincide with a packet boundary. Here, we’re telling the codec that we can handle this situation.)

One important piece of information that is stored in the AVCodecContext structure is the frame rate of the video. To allow for non-integer frame rates (like NTSC’s 29.97 fps), the rate is stored as a fraction, with the numerator in pCodecCtx->frame_rate and the denominator in pCodecCtx->frame_rate_base. While testing the library with different video files, I noticed that some codecs (notably ASF) seem to fill these fields incorrectly (frame_rate_base contains 1 instead of 1000). The following hack fixes this:

// Hack to correct wrong frame rates that seem to be generated by some
// codecs
if(pCodecCtx->frame_rate>1000 && pCodecCtx->frame_rate_base==1)
    pCodecCtx->frame_rate_base=1000;

Note that it shouldn’t be a problem to leave this fix in place even if the bug is corrected some day – it’s unlikely that a video would have a frame rate of more than 1000 fps.
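To make the fraction arithmetic concrete, here is a small stand-alone sketch. FrameRate is a stand-in struct I made up to mirror the frame_rate / frame_rate_base pair; it is not the real AVCodecContext. The helper names are likewise only for illustration.

```cpp
// Stand-alone illustration of a frame rate stored as a fraction.
// FrameRate and the helpers below are hypothetical, not ffmpeg types.
#include <cassert>
#include <cmath>

struct FrameRate {
    int rate;   // numerator, plays the role of frame_rate
    int base;   // denominator, plays the role of frame_rate_base
};

// Same heuristic as the hack above: a rate above 1000 with base 1
// is assumed to really mean "rate / 1000".
FrameRate fix_frame_rate(FrameRate fr)
{
    if (fr.rate > 1000 && fr.base == 1)
        fr.base = 1000;
    return fr;
}

double frames_per_second(FrameRate fr)
{
    return static_cast<double>(fr.rate) / fr.base;
}

// Time one frame stays on screen, in milliseconds.
double frame_duration_ms(FrameRate fr)
{
    return 1000.0 * fr.base / fr.rate;
}
```

With the values from the text, a stream reporting 29970/1 is corrected to 29970/1000, i.e. 29.97 fps. A well-formed 25/1 stream passes through unchanged at 40 ms per frame.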

One more thing left to do: Allocate a video frame to store the decoded images in:

AVFrame *pFrame;
pFrame=avcodec_alloc_frame();

That’s it! Now let’s start decoding some video.

Decoding Video Frames

As I’ve already mentioned, a video file can contain several audio and video streams, and each of those streams is split up into packets of a particular size. Our job is to read these packets one by one using libavformat, filter out all those that aren’t part of the video stream we’re interested in, and hand them on to libavcodec for decoding. In doing this, we’ll have to take care of the fact that the boundary between two frames can occur in the middle of a packet.

Sound complicated? Luckily, we can encapsulate this whole process in a routine that simply returns the next video frame:

bool GetNextFrame(AVFormatContext *pFormatCtx, AVCodecContext *pCodecCtx, int videoStream, AVFrame *pFrame)
{
    static AVPacket packet;
    static int bytesRemaining=0;
    static uint8_t *rawData;
    static bool fFirstTime=true;
    int bytesDecoded;
    int frameFinished;

    // First time we're called, set packet.data to NULL to indicate it
    // doesn't have to be freed
    if(fFirstTime) {
        fFirstTime=false;
        packet.data=NULL;
    }
    // Decode packets until we have decoded a complete frame
    while(true) {
        // Work on the current packet until we have decoded all of it
        while(bytesRemaining > 0) {
            // Decode the next chunk of data
            bytesDecoded=avcodec_decode_video(pCodecCtx, pFrame, &frameFinished, rawData, bytesRemaining);
            // Was there an error?
            if(bytesDecoded < 0) {
                fprintf(stderr, "Error while decoding frame\n");
                return false;
            }
            bytesRemaining-=bytesDecoded;
            rawData+=bytesDecoded;
            // Did we finish the current frame? Then we can return
            if(frameFinished)
                return true;
        }
        // Read the next packet, skipping all packets that aren't for this
        // stream
        do {
            // Free old packet
            if(packet.data!=NULL)
                av_free_packet(&packet);
            // Read new packet
            if(av_read_packet(pFormatCtx, &packet)<0)
                goto loop_exit;
        } while(packet.stream_index!=videoStream);
        bytesRemaining=packet.size;
        rawData=packet.data;
    }
loop_exit:
    // Decode the rest of the last frame
    bytesDecoded=avcodec_decode_video(pCodecCtx, pFrame, &frameFinished, rawData, bytesRemaining);
    // Free last packet
    if(packet.data!=NULL)
        av_free_packet(&packet);
    return frameFinished!=0;
}

Now, all we have to do is sit in a loop, calling GetNextFrame() until it returns false. Just one more thing to take care of: Most codecs return images in YUV 420 format (one luminance and two chrominance channels, with the chrominance channels sampled at half the spatial resolution of the luminance channel). Depending on what you want to do with the video data, you may want to convert this to RGB. (Note, though, that this is not necessary if all you want to do is display the video data; take a look at the X11 Xvideo extension, which does YUV-to-RGB and scaling in hardware.) Fortunately, libavcodec provides a conversion routine called img_convert, which does conversion between YUV and RGB as well as a variety of other image formats. The loop that decodes the video thus becomes:

while(GetNextFrame(pFormatCtx, pCodecCtx, videoStream, pFrame)) {
    img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24, (AVPicture*)pFrame,
        pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);
    // Process the video frame (save to disk etc.)
    DoSomethingWithTheImage(pFrameRGB);
}
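For readers curious what a YUV 420 to RGB conversion involves, here is a stand-alone sketch. It is not img_convert and makes no claim about the coefficients libavcodec uses. I assume planar input, even width and height, and the full-range BT.601 (JFIF) equations. The function names are invented for the example.

```cpp
// Stand-alone YUV 4:2:0 (planar) -> packed RGB24 conversion.
// Assumptions: full-range BT.601 (JFIF) coefficients, even width/height.
// yuv420_to_rgb24 and clamp8 are hypothetical helpers, not ffmpeg calls.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static uint8_t clamp8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return static_cast<uint8_t>(v + 0.5);   // round to nearest
}

std::vector<uint8_t> yuv420_to_rgb24(const std::vector<uint8_t> &y,
                                     const std::vector<uint8_t> &u,
                                     const std::vector<uint8_t> &v,
                                     int width, int height)
{
    std::vector<uint8_t> rgb(static_cast<std::size_t>(width) * height * 3);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            // Chroma is stored at half resolution in both directions,
            // so each U/V sample is shared by a 2x2 block of pixels.
            std::size_t ci = static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
            double Y  = y[static_cast<std::size_t>(row) * width + col];
            double Cb = u[ci] - 128.0;
            double Cr = v[ci] - 128.0;

            std::size_t o = (static_cast<std::size_t>(row) * width + col) * 3;
            rgb[o + 0] = clamp8(Y + 1.402 * Cr);
            rgb[o + 1] = clamp8(Y - 0.344136 * Cb - 0.714136 * Cr);
            rgb[o + 2] = clamp8(Y + 1.772 * Cb);
        }
    }
    return rgb;
}
```

A neutral chroma value of 128 leaves R = G = B = Y, which is an easy sanity check. Real video is often limited-range (16 to 235), and that needs different scaling from what is shown here.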

The RGB image pFrameRGB (of type AVFrame *) is allocated like this:

AVFrame *pFrameRGB;
int numBytes;
uint8_t *buffer;
// Allocate an AVFrame structure
pFrameRGB=avcodec_alloc_frame();
if(pFrameRGB==NULL)
    handle_error();
// Determine required buffer size and allocate buffer
numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width, pCodecCtx->height);
buffer=new uint8_t[numBytes];
// Assign appropriate parts of buffer to image planes in pFrameRGB
avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24, pCodecCtx->width, pCodecCtx->height);

Cleaning up

OK, we’ve read and processed our video, now all that’s left for us to do is clean up after ourselves:

// Free the RGB image
delete [] buffer;
av_free(pFrameRGB);
// Free the YUV frame
av_free(pFrame);
// Close the codec
avcodec_close(pCodecCtx);
// Close the video file
av_close_input_file(pFormatCtx);

Done!

Sample Code

A sample app that wraps all of this code up in compilable form is here. If you have any additional comments, please contact me at boehme@inb.uni-luebeckREMOVETHIS.de. Standard disclaimer: I assume no liability for the correct functioning of the code and techniques presented in this article.

//------------------------------------------------------
demo source code:
// avcodec_sample.cpp

// A small sample program that shows how to use libavformat and libavcodec to
// read video from a file.
//
// Use
//
// g++ -o avcodec_sample avcodec_sample.cpp -lavformat -lavcodec -lz
//
// to build (assuming libavformat and libavcodec are correctly installed on
// your system).
//
// Run using
//
// avcodec_sample myvideofile.mpg
//
// to write the first five frames from "myvideofile.mpg" to disk in PPM
// format.

#include <avcodec.h>
#include <avformat.h>

#include <stdio.h>

bool GetNextFrame(AVFormatContext *pFormatCtx, AVCodecContext *pCodecCtx,
    int videoStream, AVFrame *pFrame)
{
    static AVPacket packet;
    static int      bytesRemaining=0;
    static uint8_t  *rawData;
    static bool     fFirstTime=true;
    int             bytesDecoded;
    int             frameFinished;

    // First time we’re called, set packet.data to NULL to indicate it
    // doesn’t have to be freed
    if(fFirstTime)
    {
        fFirstTime=false;
        packet.data=NULL;
    }

    // Decode packets until we have decoded a complete frame
    while(true)
    {
        // Work on the current packet until we have decoded all of it
        while(bytesRemaining > 0)
        {
            // Decode the next chunk of data
            bytesDecoded=avcodec_decode_video(pCodecCtx, pFrame,
                &frameFinished, rawData, bytesRemaining);

            // Was there an error?
            if(bytesDecoded < 0)
            {
                fprintf(stderr, "Error while decoding frame\n");
                return false;
            }

            bytesRemaining-=bytesDecoded;
            rawData+=bytesDecoded;

            // Did we finish the current frame? Then we can return
            if(frameFinished)
                return true;
        }

        // Read the next packet, skipping all packets that aren’t for this
        // stream
        do
        {
            // Free old packet
            if(packet.data!=NULL)
                av_free_packet(&packet);

            // Read new packet
            if(av_read_packet(pFormatCtx, &packet)<0)
                goto loop_exit;
        } while(packet.stream_index!=videoStream);

        bytesRemaining=packet.size;
        rawData=packet.data;
    }

loop_exit:

    // Decode the rest of the last frame
    bytesDecoded=avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
        rawData, bytesRemaining);

    // Free last packet
    if(packet.data!=NULL)
        av_free_packet(&packet);

    return frameFinished!=0;
}

void SaveFrame(AVFrame *pFrame, int width, int height, int iFrame)
{
    FILE *pFile;
    char szFilename[32];
    int  y;

    // Open file
    sprintf(szFilename, "frame%d.ppm", iFrame);
    pFile=fopen(szFilename, "wb");
    if(pFile==NULL)
        return;

    // Write header
    fprintf(pFile, "P6 %d %d 255\n", width, height);

    // Write pixel data
    for(y=0; y<height; y++)
        fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);

    // Close file
    fclose(pFile);
}

int main(int argc, char *argv[])
{
    AVFormatContext *pFormatCtx;
    int             i, videoStream;
    AVCodecContext  *pCodecCtx;
    AVCodec         *pCodec;
    AVFrame         *pFrame;
    AVFrame         *pFrameRGB;
    int             numBytes;
    uint8_t         *buffer;

    // Register all formats and codecs
    av_register_all();

    // Open video file
    if(argc<2)
        return -1; // No input file given
    if(av_open_input_file(&pFormatCtx, argv[1], NULL, 0, NULL)!=0)
        return -1; // Couldn't open file

    // Retrieve stream information
    if(av_find_stream_info(pFormatCtx)<0)
        return -1; // Couldn’t find stream information

    // Dump information about file onto standard error
    dump_format(pFormatCtx, 0, argv[1], false);

    // Find the first video stream
    videoStream=-1;
    for(i=0; i<pFormatCtx->nb_streams; i++)
        if(pFormatCtx->streams[i]->codec.codec_type==CODEC_TYPE_VIDEO)
        {
            videoStream=i;
            break;
        }
    if(videoStream==-1)
        return -1; // Didn’t find a video stream

    // Get a pointer to the codec context for the video stream
    pCodecCtx=&pFormatCtx->streams[videoStream]->codec;

    // Find the decoder for the video stream
    pCodec=avcodec_find_decoder(pCodecCtx->codec_id);
    if(pCodec==NULL)
        return -1; // Codec not found

    // Inform the codec that we can handle truncated bitstreams — i.e.,
    // bitstreams where frame boundaries can fall in the middle of packets
    if(pCodec->capabilities & CODEC_CAP_TRUNCATED)
        pCodecCtx->flags|=CODEC_FLAG_TRUNCATED;

    // Open codec
    if(avcodec_open(pCodecCtx, pCodec)<0)
        return -1; // Could not open codec

    // Hack to correct wrong frame rates that seem to be generated by some
    // codecs
    if(pCodecCtx->frame_rate>1000 && pCodecCtx->frame_rate_base==1)
        pCodecCtx->frame_rate_base=1000;

    // Allocate video frame
    pFrame=avcodec_alloc_frame();

    // Allocate an AVFrame structure
    pFrameRGB=avcodec_alloc_frame();
    if(pFrameRGB==NULL)
        return -1;

    // Determine required buffer size and allocate buffer
    numBytes=avpicture_get_size(PIX_FMT_RGB24, pCodecCtx->width,
        pCodecCtx->height);
    buffer=new uint8_t[numBytes];

    // Assign appropriate parts of buffer to image planes in pFrameRGB
    avpicture_fill((AVPicture *)pFrameRGB, buffer, PIX_FMT_RGB24,
        pCodecCtx->width, pCodecCtx->height);

    // Read frames and save first five frames to disk
    i=0;
    while(GetNextFrame(pFormatCtx, pCodecCtx, videoStream, pFrame))
    {
        img_convert((AVPicture *)pFrameRGB, PIX_FMT_RGB24, (AVPicture*)pFrame,
            pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);

        // Save the frame to disk
        if(++i<=5)
            SaveFrame(pFrameRGB, pCodecCtx->width, pCodecCtx->height, i);
    }

    // Free the RGB image
    delete [] buffer;
    av_free(pFrameRGB);

    // Free the YUV frame
    av_free(pFrame);

    // Close the codec
    avcodec_close(pCodecCtx);

    // Close the video file
    av_close_input_file(pFormatCtx);

    return 0;
}

November 6, 2004

Software codec using layered coding debuts in Japan

 
2004/10/28

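  [Nikkei BP report] Japan’s NTT CyberSpace Laboratories has developed an MPEG-4 software codec for content delivery and videoconferencing. It can encode 30 frames/s VGA video at a bit rate of 4 Mbit/s. On a PC server running at around 3 GHz, it can encode video without dropping frames even when used in a videoconferencing system (Figure 1).


Figure 1

  The software codec uses hierarchical (layered) encoding and has two distinguishing features. (1) A video stream encoded at one bit rate can be decoded at any bit rate up to that rate; (2) the bit rate can be raised for an arbitrary region of the video stream only (Figure 2).

  Both features come mainly from how the encoding is organized: the stream is split into a base layer and an enhancement layer (Figure 3). The base layer is encoded with MPEG-4 Simple Profile or MPEG-4 Advanced Simple Profile, at a bit rate of 100 Kbit/s and QVGA image size. At 100 Kbit/s picture quality naturally suffers, so blocking (mosaic) noise is clearly visible. The enhancement layer compensates for this loss of quality. It is encoded with MPEG-4 FGS (Fine Granular Scalability). Specifically, when VGA video is encoded, the image is converted to QVGA size, and the enhancement layer encodes the information lost in that step.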

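  In a videoconferencing system, only the base layer is sent at first. Enhancement-layer information is then selected and sent according to the data rate of the communication line. The enhancement layer carries the difference between the video decoded from the base layer and the original image, so the more difference information the decoder receives, the higher the bit rate of the video it can decode. The bit rate can therefore be changed freely to match the transmission rate. However, because the video is encoded as a base layer plus an enhancement layer, coding efficiency drops by about 20%. “How to reduce this loss is very important. The current loss is roughly the same as with ordinary layered coding techniques. We intend to reduce it through further improvements.” (Yoshiyuki Yashima (八岛 由幸), Senior Research Engineer and group leader, Video Coding Technology Group, Media Communication Project, NTT CyberSpace Laboratories)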
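The base-plus-difference idea can be shown with a toy example. This is not MPEG-4 FGS, which works on transform coefficients inside a real video codec. It is a sketch on raw 8-bit samples where the base layer keeps the top 4 bits and the enhancement layer carries the remaining 4 bits as bit planes, most significant first. All names and the 4-bit split are my own assumptions.

```cpp
// Toy layered coding on 8-bit samples: coarse base layer + residual bit planes.
// An illustration of the principle only, not MPEG-4 FGS.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const int kBaseShift = 4;   // base layer keeps the top 4 bits of each sample

struct Layered {
    std::vector<uint8_t> base;      // coarse version (always transmitted)
    std::vector<uint8_t> residual;  // original minus reconstructed base, 0..15
};

Layered encode_layers(const std::vector<uint8_t> &samples)
{
    Layered out;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        uint8_t b = static_cast<uint8_t>(samples[i] >> kBaseShift);
        out.base.push_back(b);
        out.residual.push_back(static_cast<uint8_t>(samples[i] - (b << kBaseShift)));
    }
    return out;
}

// planes = how many residual bit planes arrived (0..kBaseShift), MSB first.
// More planes received means a reconstruction closer to the original.
std::vector<uint8_t> decode_layers(const Layered &in, int planes)
{
    if (planes < 0) planes = 0;
    if (planes > kBaseShift) planes = kBaseShift;
    const int dropped = kBaseShift - planes;
    const int mask = (0xFF << dropped) & ((1 << kBaseShift) - 1);

    std::vector<uint8_t> out;
    for (std::size_t i = 0; i < in.base.size(); ++i)
        out.push_back(static_cast<uint8_t>((in.base[i] << kBaseShift) |
                                           (in.residual[i] & mask)));
    return out;
}
```

The sender encodes once. A receiver on a slow line decodes with planes = 0, and one on a fast line with planes = 4, which matches the "decode at any rate up to the encoded rate" property described above.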

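  Feature (2) is achieved by obtaining difference information only for a limited region rather than uniformly over the whole frame (Figure 4). When the decoder specifies the region whose bit rate it wants raised, the coordinates of the region’s four corners are sent to the server. The server then sends the decoder only the difference data inside that region, so the bit rate is raised for that region alone. With this feature, the same video stream can be played back at different picture quality even over links with the same bandwidth.

  Although the base layer is currently encoded with MPEG-4 Simple Profile or MPEG-4 Advanced Simple Profile, NTT CyberSpace Laboratories plans to switch to H.264/MPEG-4 AVC. “At the same bit rate, H.264 delivers video with less image noise.” (Yoshiyuki Yashima). NTT exhibited a videoconferencing system using the software codec at “CTIA WIRELESS I.T. & Entertainment 2004”, held in the United States on October 25, 2004. (Reporter: Hiroki Ito (伊藤 大贵))


Figure 2


Figure 3
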
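ARM recently announced NEON, a new media and signal processing technology that will accelerate a wide range of applications. ARM NEON technology targets mobile phones and consumer entertainment electronics, and can flexibly implement many kinds of video encoding/decoding, 3D graphics, speech processing, audio decoding, image processing and baseband functions. NEON will be used in future ARM processors and will be widely supported by ARM and third-party tool vendors.

    NEON is a 64/128-bit single instruction, multiple data (SIMD) instruction set for accelerating next-generation media and signal processing applications. With NEON, an MP3 audio decoder can run at a CPU frequency below 10 MHz, and the GSM AMR speech codec needs only 13 MHz. The technology comprises an instruction set, a separate register file and independent execution hardware. NEON supports 8-bit, 16-bit, 32-bit and 64-bit integer and single-precision floating-point SIMD operations for audio/video, graphics and game processing.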
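To give a feel for what "single instruction, multiple data" means with eight 8-bit lanes in a 64-bit register, here is a portable scalar model. It is not NEON code and uses no ARM intrinsics. It emulates a lane-wise wrapping add with ordinary 64-bit integer operations. The function names are invented for the illustration.

```cpp
// Portable model of a SIMD add over eight unsigned 8-bit lanes packed
// into one 64-bit word. Not NEON code; names are hypothetical.
#include <cassert>
#include <cstdint>

// Each lane wraps modulo 256 and never carries into its neighbour.
uint64_t add_u8x8(uint64_t a, uint64_t b)
{
    const uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;  // low 7 bits of every lane
    const uint64_t top1 = 0x8080808080808080ULL;  // top bit of every lane
    uint64_t sum = (a & low7) + (b & low7);       // carries stop at bit 7 of each lane
    return sum ^ ((a ^ b) & top1);                // add top bits, discard carry-out
}

// Reference: the same result computed one lane at a time.
uint64_t add_u8x8_scalar(uint64_t a, uint64_t b)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 8; ++lane) {
        uint8_t x = static_cast<uint8_t>(a >> (8 * lane));
        uint8_t y = static_cast<uint8_t>(b >> (8 * lane));
        out |= static_cast<uint64_t>(static_cast<uint8_t>(x + y)) << (8 * lane);
    }
    return out;
}
```

The scalar reference needs a loop of eight steps. A SIMD unit performs the equivalent lane-wise operation in one instruction, and that is where the low clock frequencies quoted for audio decoding come from.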
    
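    Mike Inglis, ARM’s vice president of marketing, said: “Everyone from consumers to chip designers will benefit from NEON technology. It will bring desktop-quality audio, video and 3D graphics to consumer electronics and can be reprogrammed as industry standards change. NEON gives system developers unprecedented flexibility and high performance, meeting the needs of next-generation mobile phones and consumer electronics products in which power and area are tightly constrained.”

    Tony Massimini, chief of technology for micrologic at Semico Research, said: “We have been expecting ARM to deliver this kind of advanced technology. NEON further shows that ARM continues to develop new technology in its cores to meet growing demand in the mobile phone and digital consumer electronics markets.”

    The NEON instruction set can be targeted from C compilers, which ARM and third parties will provide. The technology also supports the OpenMAX API defined by the Khronos Group, improving software portability and reuse and shortening time to market.

    NEON technology supports ARM OptimoDE data engines. NEON uses a fixed instruction set and suits a wide range of applications that run as software on a general-purpose processor. OptimoDE data engines execute highly configurable VLIW instructions and deliver excellent performance for specific applications. With NEON and OptimoDE together, next-generation mobile phones and many other applications will achieve outstanding baseband and application performance.

    NEON will be used in future ARM processor families. To meet its partners’ specific application requirements, ARM has developed core technologies at a range of price and performance points for many markets: ARM TrustZone technology for data security; Jazelle technology for Java acceleration; Intelligent Energy Manager for power management; the new NEON technology; and the OptimoDE data engine for media and signal processing.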