Source author record

Walid Mahdi

Walid Mahdi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

5 works
3 topics
4 close collaborators

Published work

5 published items

preprint, 2014, arXiv

Automatic video scene segmentation based on spatial-temporal clues and rhythm

With ever-increasing computing power and data storage capacity, the potential for large digital video libraries is growing rapidly. However, the widespread use of video is currently limited by its opaque nature: a user who has to browse a video sequentially needs too much time to find the segments of interest. Therefore, providing an environment that is both convenient and efficient for video storage and retrieval, especially for content-based searching of the kind available in traditional text-based database systems, has been the focus of recent and important efforts by a large research community. In this paper, we propose a new automatic video scene segmentation method that explores two main video features: the spatial-temporal relationship and the rhythm of shots. The experimental evidence we obtained from an 80-minute video showed that our prototype provides very high accuracy for video segmentation.

preprint, 2013, arXiv

A Visual Grammar Approach for TV Program Identification

Automatic identification of TV programs within TV streams is an important task for archive exploitation. This paper proposes a new spatial-temporal approach to identify programs in TV streams in two main steps. First, a reference catalogue of video grammars for visual jingles is constructed. We exploit visual grammars characterizing instances of the same program type in order to identify the various program types in the TV stream. The role of a video grammar is to represent the visual invariants of each visual jingle using a set of descriptors appropriate for each TV program. Second, programs in TV streams are identified by comparing the visual similarity of the video signal signature in the TV stream to the visual grammars in the catalogue. After presenting the proposed approach, the paper gives an overview of the encouraging experimental results on several streams extracted from different channels and composed of several programs.
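The second step described in this abstract, comparing a stream signature against catalogue entries, can be illustrated with a minimal sketch. The abstract does not name the descriptors or the similarity measure, so everything here is an assumption made for illustration: signatures are plain numeric vectors, the measure is cosine similarity, and the names `cosine_similarity`, `identify_program` and `min_similarity` are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two descriptor vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_program(
    signature: List[float],
    catalogue: Dict[str, List[float]],
    min_similarity: float,
) -> Optional[str]:
    """Return the catalogue entry most similar to the signature, or None if
    no entry reaches min_similarity (an assumed acceptance threshold)."""
    best_name: Optional[str] = None
    best_score = min_similarity
    for name, reference in catalogue.items():
        score = cosine_similarity(signature, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy catalogue: one hypothetical reference vector per program jingle.
catalogue = {"news": [1.0, 0.0, 0.0], "weather": [0.0, 1.0, 0.0]}
match = identify_program([0.9, 0.1, 0.0], catalogue, min_similarity=0.8)
```

Returning `None` below the threshold models the case where a stream segment matches no known jingle; how the paper handles that case is not stated in the abstract.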

preprint, 2013, arXiv

AViTExt: Automatic Video Text Extraction, A new Approach for video content indexing Application

In this paper, we propose a spatial-temporal video-text detection technique which proceeds in two principal steps: potential text region detection and a filtering process. In the first step, we dynamically divide each pair of consecutive video frames into sub-blocks in order to detect change. A significant difference between homologous blocks implies the appearance of an important object, which may be a text region. Temporal redundancy is then used to filter these regions and form an effective text region. Experiments conducted on a variety of video sequences show the effectiveness of our approach, which obtains a precision of 89.39% and a recall of 90.19%.
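The two steps in this abstract, block-wise change detection followed by temporal filtering, can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' implementation: the block size is fixed here (the abstract says the division is dynamic), "temporal redundancy" is interpreted as a block changing once and then staying stable for several frames, and the names `changed_blocks`, `persistent_blocks`, `threshold` and `min_frames` are hypothetical.

```python
from typing import List, Set, Tuple

Frame = List[List[int]]  # grayscale frame as rows of pixel intensities
Block = Tuple[int, int]  # (block_row, block_col) index


def changed_blocks(prev: Frame, curr: Frame, block: int, threshold: float) -> Set[Block]:
    """Blocks whose mean absolute pixel difference between two frames exceeds threshold."""
    height, width = len(curr), len(curr[0])
    hits: Set[Block] = set()
    for top in range(0, height, block):
        for left in range(0, width, block):
            total = 0
            count = 0
            for y in range(top, min(top + block, height)):
                for x in range(left, min(left + block, width)):
                    total += abs(curr[y][x] - prev[y][x])
                    count += 1
            if total / count > threshold:
                hits.add((top // block, left // block))
    return hits


def persistent_blocks(
    frames: List[Frame], block: int, threshold: float, min_frames: int
) -> Set[Block]:
    """Blocks that change at some frame and then remain unchanged for the
    rest of a min_frames-long window (assumed temporal-redundancy filter)."""
    kept: Set[Block] = set()
    for i in range(1, len(frames) - min_frames + 1):
        appeared = changed_blocks(frames[i - 1], frames[i], block, threshold)
        # Drop blocks that change again inside the window: they are transient.
        for j in range(i + 1, i + min_frames):
            appeared -= changed_blocks(frames[j - 1], frames[j], block, threshold)
        kept |= appeared
    return kept
```

A block that lights up in one frame and vanishes in the next is discarded, while a block that appears and persists is kept as a candidate text region. Real overlay text would still need further checks before being accepted as text.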

preprint, 2013, arXiv

Content-Based Video Browsing by Text Region Localization and Classification

The amount of digital video data is increasing worldwide. This highlights the need for efficient algorithms that can index, retrieve and browse this data by content. This can be achieved by identifying semantic descriptions captured automatically from the video structure. Among these descriptions, text within video is considered a rich feature that enables effective video indexing and browsing. Unlike most video text detection and extraction methods, which treat video sequences as collections of still images, we propose in this paper a spatiotemporal video-text localization and identification approach which proceeds in two main steps: text region localization and text region classification. In the first step, we detect the significant appearance of new objects in a frame by a split-and-merge process applied to binarized edge frame pair differences. Detected objects are, a priori, considered as text. They are then filtered according to both local contrast variation and texture criteria in order to retain the effective ones. The resulting text regions are classified based on a visual grammar descriptor containing a set of semantic text class regions characterized by visual features. A visual table of contents is then generated from the extracted text regions occurring within the video sequence, enriched by a semantic identification. Experiments performed on a variety of video sequences show the efficiency of our approach.

preprint, 2013, arXiv

Lip Localization and Viseme Classification for Visual Speech Recognition

The need for an automatic lip-reading system is ever increasing. In fact, today, the extraction and reliable analysis of facial movements make up an important part of many multimedia systems such as videoconference, low communication systems and lip-reading systems. In addition, visual information is imperative among people with special needs. We can imagine, for example, a dependent person commanding a machine with an easy lip movement or a simple syllable pronunciation. Moreover, people with hearing problems compensate for their special needs by lip-reading as well as listening to the person with whom they are talking.