1 Summary

Recently, data mining and machine learning techniques have been widely applied to multimedia information retrieval and recommendation, and a growing body of research has sought to bridge the semantic gap in these tasks. Driven by vast sources of multimedia in audio, image, text, and video formats, and by advances in the semantic annotation of large multimedia collections, data mining and machine learning now permeate nearly every area of multimedia content analysis, retrieval, and recommendation. This special issue presents high-quality contributions that apply data mining and machine learning techniques to multimedia retrieval and recommendation, addressing practical issues such as robustness to noise and scalability.

After two rounds of review, we selected 11 manuscripts for this special issue. Each selected manuscript was blind-reviewed by at least three reviewers drawn from the guest editors and external reviewers.

The paper entitled “Multivoxel Analysis for Functional Magnetic Resonance Imaging (fMRI) Based on Time-Series and Contextual Information: Relationship Between Maternal Love and Brain Regions as a Case Study” (10.1007/s11042-014-2020-4), by Bo-Wei Chen, Yang-Yen Ou, Chun-Chia Kung, Ding-Ruey Yeh, Seungmin Rho, and Jhing-Fa Wang, uses discriminant analysis integrated with machine learning to find active brain regions. To enhance performance, the authors also develop a novel feature, the blood-oxygen-level-dependent (BOLD) contrast edge, which models time-series changes of voxels in brain images. The BOLD contrast edge captures the actual blood activation when brain cells respond during fMRI. Experimental results show that the proposed feature combined with discriminant analysis achieves an accuracy of up to 83.33%, demonstrating the system's effectiveness compared with classic approaches.
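To make the pipeline concrete, the sketch below pairs a simple temporal-gradient feature, used here as an illustrative stand-in for the authors' BOLD contrast edge, with scikit-learn's linear discriminant analysis. The shapes, data, and feature definition are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch: a temporal-edge feature per voxel fed to linear
# discriminant analysis. Not the authors' exact feature or code.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def temporal_edge_feature(voxel_ts):
    """Summarize each voxel's BOLD time series by its temporal gradient,
    a rough analogue of a 'contrast edge' over time."""
    grad = np.gradient(voxel_ts, axis=-1)   # frame-to-frame change
    return np.abs(grad).mean(axis=-1)       # mean absolute change per voxel

# trials: (n_trials, n_voxels, n_timepoints); labels: condition per trial
rng = np.random.default_rng(0)
trials = rng.normal(size=(60, 200, 50))
labels = rng.integers(0, 2, size=60)

X = temporal_edge_feature(trials)           # (n_trials, n_voxels)
clf = LinearDiscriminantAnalysis().fit(X[:40], labels[:40])
print("held-out accuracy:", clf.score(X[40:], labels[40:]))
```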

The paper entitled “An Efficient Face Detection based on Color-filtering and its Application to Smart Devices” (10.1007/s11042-013-1786-0), by Yeong Nam Chae, Taewoo Han, Yong-Ho Seo, and Hyun S. Yang, proposes a color-filtering-based face detection method with efficient region scanning, which enhances conventional face detectors that rely on the sliding-window approach and appearance information from a gray image. The authors adopt a kernel-based object tracker to lower the computational cost, and implement a real-time face detection and tracking application on the iPhone by integrating the proposed face detector with the kernel-based tracker.
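The following sketch illustrates the general idea of color-filtered region scanning: a cheap skin-color mask prunes sliding-window positions before an expensive appearance-based detector runs. The RGB thresholds and window parameters are illustrative assumptions, not the paper's filter.

```python
# Illustrative sketch of color-filtered scanning; thresholds are made up.
import numpy as np

def skin_mask(rgb):
    """Very rough RGB skin-color rule (illustrative only)."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b) & ((r - g) > 15)

def candidate_windows(rgb, win=24, stride=8, min_skin_ratio=0.4):
    """Yield only windows with enough skin pixels, so the expensive
    appearance-based detector scans far fewer regions."""
    mask = skin_mask(rgb)
    h, w = mask.shape
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            if mask[y:y+win, x:x+win].mean() >= min_skin_ratio:
                yield (x, y, win, win)   # pass these to the face detector

img = np.zeros((64, 64, 3), dtype=np.uint8)
img[16:40, 16:40] = (200, 120, 90)       # skin-like patch
print(len(list(candidate_windows(img)))) # windows kept for detailed scanning
```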

In another paper, entitled “Updating the High-Utility Pattern Trees with Transaction Modification” (10.1007/s11042-014-2178-9), Chun-Wei Lin, Binbin Zhang, Wensheng Gan, Bo-Wei Chen, Seungmin Rho, and Tzung-Pei Hong develop a maintenance algorithm, the Fast Updated High Utility Pattern tree for transaction MODification (FUP-HUP-tree-MOD), for maintaining and updating the discovered high-utility itemsets (HUIs) under transaction modification. The built HUP tree serves as an effective structure for reducing multiple database scans by keeping the information necessary for later maintenance and mining. With the FUP-HUP-tree-MOD algorithm operating on the built HUP tree, the original database does not need to be rescanned each time, especially when the number of modified transactions is quite small. Because the traditional two-phase and HUPtree-Batch algorithms run in batch mode, experiments show that the proposed FUP-HUP-tree-MOD algorithm requires less execution time to find the complete set of HUIs. In addition, only a slightly larger number of tree nodes is generated compared with the batch-mode HUPtree-Batch algorithm.
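For readers unfamiliar with the high-utility itemset notion the algorithm maintains, the toy sketch below computes itemset utilities by brute force on a tiny database; the FUP-HUP-tree structure and incremental maintenance are omitted, and the quantities and unit profits are invented for illustration.

```python
# Toy illustration of high-utility itemsets (HUIs); brute-force only.
from itertools import combinations

profit = {"A": 5, "B": 2, "C": 1}      # external utility (profit per unit)
transactions = [                        # each maps item -> purchased quantity
    {"A": 1, "B": 3},
    {"B": 2, "C": 6},
    {"A": 2, "C": 1},
]

def utility(itemset, txn):
    # An itemset contributes only if every item occurs in the transaction.
    if not all(i in txn for i in itemset):
        return 0
    return sum(profit[i] * txn[i] for i in itemset)

def high_utility_itemsets(min_util):
    items = sorted(profit)
    for r in range(1, len(items) + 1):
        for itemset in combinations(items, r):
            total = sum(utility(itemset, t) for t in transactions)
            if total >= min_util:
                yield itemset, total

print(list(high_utility_itemsets(min_util=12)))
```

A transaction modification changes a quantity in one transaction; the point of FUP-HUP-tree-MOD is to update the discovered HUIs without recomputing totals over the whole database as this brute-force version would.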

The next paper, entitled “Incorporating Frequent Pattern Analysis into Multimodal HMM Event Classification for Baseball Videos” (10.1007/s11042-015-2447-2), by Hsuan-Sheng Chen and Wen-Jiin Tsai, develops an HMM-based event classification system that integrates frequent pattern analysis. By representing a video as a temporal database of multimodal interval features, different symbol coding methods can be chosen to encode the multimodal features as a sequence of symbols.
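One possible symbol coding, sketched below under our own assumptions rather than the paper's scheme, samples which interval features are active at each time step and maps each active-set to a discrete symbol, yielding an observation sequence an HMM can consume.

```python
# Minimal sketch of one symbol coding for multimodal interval features.
intervals = [               # (feature name, start, end) in seconds
    ("close_up",    0.0, 2.0),
    ("crowd_cheer", 1.0, 3.0),
    ("scoreboard",  2.5, 4.0),
]

def encode(intervals, t_end, step=0.5):
    symbols, codebook = [], {}
    t = 0.0
    while t < t_end:
        # The set of features active at time t becomes one discrete symbol.
        active = frozenset(n for n, s, e in intervals if s <= t < e)
        symbols.append(codebook.setdefault(active, len(codebook)))
        t += step
    return symbols, codebook

seq, codebook = encode(intervals, t_end=4.0)
print(seq)   # discrete observation sequence for HMM training/decoding
```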

The paper entitled “Modelling Multilevel Data in Multimedia: A Hierarchical Factor Analysis Approach” (10.1007/s11042-014-2394-3), by Sunil Gupta, Dinh Phung, and Svetha Venkatesh, proposes a framework that discovers low-dimensional structures for a primary data source together with other associated information, and presents a generalized hierarchical factor analysis for modeling multimedia data with low-level and high-level features. Their model acts as a hierarchical subspace learner, grouping the primary medium in a latent subspace while incorporating the associated media to improve the grouping.
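As a flat, simplified analogue of this idea (not the authors' hierarchical model), the sketch below learns a latent subspace over concatenated primary and associated features via ordinary factor analysis and then groups items in that subspace; all data and dimensions are illustrative.

```python
# Flat analogue: latent subspace learning plus grouping in the subspace.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
primary = rng.normal(size=(300, 64))     # e.g., low-level visual features
associated = rng.normal(size=(300, 16))  # e.g., associated tag/text features

# Concatenating the associated medium lets it influence the subspace;
# the paper's hierarchical model couples the sources more carefully.
X = np.hstack([primary, associated])
latent = FactorAnalysis(n_components=8, random_state=0).fit_transform(X)
groups = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(latent)
print(groups[:10])
```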

The next paper, entitled “Improvement of collaborative filtering using Rating Normalization” (10.1007/s11042-013-1814-0), by Soo-Cheol Kim, Kyoung-Jun Sung, Chan-Soo Park, and Sung Kwon Kim, improves an existing preference prediction algorithm to increase the accuracy of recommendation systems. The proposed method consists of two processes: identifying users' rating dispositions through clustering, and normalizing ratings according to those dispositions so that preference prediction accounts for differences in how users rate.
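A minimal sketch of the normalization idea follows, with the paper's clustering of rating dispositions simplified to per-user mean and standard deviation (an assumption for illustration): ratings are mapped to a common scale, aggregated, and mapped back to the target user's own scale.

```python
# Per-user rating normalization for collaborative filtering (sketch).
import numpy as np

ratings = {                             # user -> {item: rating}
    "u1": {"i1": 5, "i3": 4},           # generous rater; has not seen i2
    "u2": {"i1": 2, "i2": 1, "i3": 3},  # strict rater
}

def normalize(user_ratings):
    """Z-score a user's ratings to remove their rating disposition."""
    vals = np.array(list(user_ratings.values()), dtype=float)
    mu, sigma = vals.mean(), vals.std() or 1.0
    return {i: (r - mu) / sigma for i, r in user_ratings.items()}

norm = {u: normalize(r) for u, r in ratings.items()}

# Predict u1's rating on i2 from the neighbor's normalized opinion,
# mapped back onto u1's own rating scale.
z_pred = norm["u2"]["i2"]
u1_vals = np.array(list(ratings["u1"].values()), dtype=float)
mu, sigma = u1_vals.mean(), u1_vals.std() or 1.0
print("predicted rating for u1 on i2:", mu + sigma * z_pred)
```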

Another paper, entitled “Efficient Foreground Extraction using RGB-D Imaging” (10.1007/s11042-013-1789-x), by Sang-Woo Lee, Yong-Ho Seo, and Hyun S. Yang, presents a foreground extraction algorithm that utilizes depth information from RGB-D sensors such as the Microsoft Kinect and offers users guidance in the foreground extraction process. The algorithm can serve as pre-processing for slow segmentation algorithms, reducing target regions and search space, and can guide users so that interactive segmentation methods require less interaction.
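The sketch below shows, under our own simplifying assumptions rather than the paper's algorithm, how a depth band can produce a coarse foreground mask whose bounding box narrows the search space handed to a slower color-based segmenter.

```python
# Coarse depth-based foreground mask as pre-processing (illustrative only).
import numpy as np

def coarse_foreground(depth, near=0.5, far=1.5):
    """depth in meters; keep pixels inside the user-indicated depth band."""
    return (depth > near) & (depth < far)

rng = np.random.default_rng(2)
depth = rng.uniform(0.3, 4.0, size=(480, 640))   # stand-in for a Kinect frame
mask = coarse_foreground(depth)
ys, xs = np.nonzero(mask)
# Bounding box of the mask = reduced target region for fine segmentation.
print("ROI:", xs.min(), ys.min(), xs.max(), ys.max())
```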

The next paper, entitled “Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments” (10.1007/s11042-013-1788-y), by Jeong-Sik Park, Gil-Jin Jang, Ji-Hwan Kim, and Sang-Soo Yeo, proposes a noise reduction scheme that employs adaptive comb filtering, modifying the conventional comb filter using line spectral pair parameters. The authors conduct speech recognition experiments on the Aurora2 database to verify the efficiency of the proposed noise reduction approach.
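To show what comb filtering does in this context, here is a simplified sketch in which the paper's LSP-based adaptation is replaced by a fixed pitch period (an assumption): the filter reinforces components periodic at the pitch period, i.e., the voiced speech harmonics, relative to aperiodic noise.

```python
# Simplified feed-forward comb filter for speech enhancement (sketch).
import numpy as np

def comb_filter(x, period, gain=0.9):
    """y[n] = x[n] + gain * x[n - period]; reinforces signal components
    periodic at `period` samples relative to aperiodic noise."""
    y = np.copy(x)
    y[period:] += gain * x[:-period]
    return y / (1.0 + gain)                 # normalize the overall level

fs = 8000
t = np.arange(fs) / fs
speech_like = np.sin(2 * np.pi * 200 * t)   # 200 Hz "voiced" tone
noisy = speech_like + 0.5 * np.random.default_rng(3).normal(size=fs)
enhanced = comb_filter(noisy, period=fs // 200)  # delay = one pitch period
```

The adaptive part of the paper's scheme amounts to tracking the right delay (and filter shape) as the speech changes, rather than fixing them as above.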

In another paper, entitled “Detection and Localization of Illegal Electricity Usage in Power Distribution Line” (10.1007/s11042-014-2022-2), Mandakh Oyun-Erdene, Bat-Erdene Byambasuren, Eric T. Matson, and Donghan Kim propose an inspection robot for detecting and localizing illegal electricity usage. The robot can locate illegal electricity usage on an overhead transmission line without disconnecting the end user's electric connection.

The next paper, entitled “A Locality-Aware Resource Management Scheme for the Hierarchical P2P System” (10.1007/s11042-014-1854-0), by Chung-Pyo Hong, proposes a mobile locality-based hierarchical P2P overlay network (MLH-Net) that exploits mobility features in a mobile environment to address locality problems without requiring any additional services. MLH-Net is constructed in two layers: an upper layer formed by super-nodes and a lower layer formed by normal nodes. The proposed method guarantees that physical locality between a requestor and a target is utilized during any discovery process.
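The schematic sketch below conveys the two-layer structure in the spirit of MLH-Net, with all class names, regions, and routing details assumed for illustration: normal nodes attach to a super-node in their own region, so discovery between nearby peers stays physically local.

```python
# Schematic two-layer overlay (illustrative assumptions throughout).
class SuperNode:
    def __init__(self, region):
        self.region = region
        self.members = {}                 # normal node id -> set of resources

    def publish(self, node_id, resources):
        self.members[node_id] = set(resources)

    def discover(self, resource):
        return [n for n, res in self.members.items() if resource in res]

supers = {"region_a": SuperNode("region_a"), "region_b": SuperNode("region_b")}

def join(node_id, region, resources):
    supers[region].publish(node_id, resources)   # attach by physical locality

join("n1", "region_a", ["fileA"])
join("n2", "region_a", ["fileB"])
# A request from a region_a node consults its local super-node first,
# so nearby resources are found without leaving the region:
print(supers["region_a"].discover("fileB"))      # -> ['n2']
```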

The last paper, entitled “Implementation of a Large-scale Language Model Adaptation in a Cloud Environment” (10.1007/s11042-013-1787-z), by Kwang-Ho Kim, Dae-Young Jung, Donghyun Lee, Hyuk-Jun Lee, Sung-Yong Park, Myoung-Wan Koo, Ji-Hwan Kim, Jeong-sik Park, Hyung-Bae Jeon, and Yun-Keun Lee, presents a system for large-scale trigram language model (LM) adaptation on a daily generated, large text corpus using MapReduce in a cloud environment. The system is implemented on a representative cloud service, Amazon EC2, with the Hadoop distributed processing framework. The authors seek the optimal number of Amazon EC2 instances for LM adaptation under the time constraint that the daily generated Twitter texts must be processed within one day.
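To illustrate the MapReduce pattern underlying such trigram counting, here is a minimal in-process imitation; on Hadoop the map and reduce functions below would run distributed across instances, with the framework handling grouping by key. The corpus and function names are illustrative, not the paper's code.

```python
# In-process imitation of MapReduce for trigram counts (sketch).
from collections import defaultdict

def map_phase(line):
    """Emit (trigram, 1) pairs for one line of text."""
    words = line.split()
    for i in range(len(words) - 2):
        yield (tuple(words[i:i+3]), 1)

def reduce_phase(pairs):
    """Sum counts per trigram; a real framework groups by key first."""
    counts = defaultdict(int)
    for trigram, c in pairs:
        counts[trigram] += c
    return counts

corpus = ["the cat sat on the mat", "the cat sat down"]
pairs = (p for line in corpus for p in map_phase(line))
print(reduce_phase(pairs))
```

The paper's optimization question is then how many EC2 instances running such jobs are needed so that one day's Twitter text finishes within the one-day deadline.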