%0 Journal Article %T Transductive Multi-Modality Video Semantic Concept Detection with Tensor Representation
%A WU Fei
%A LIU Ya-Nan
%A ZHUANG Yue-Ting
%J Journal of Software (软件学报)
%D 2008
%X A higher-order tensor framework for video analysis and understanding is proposed in this paper. In this framework, the image frames, audio, and text of a video shot, i.e., its three modalities, are represented as data points by a 3rd-order tensor. A subspace embedding and dimension reduction method, called the TensorShot approach, is then proposed; it explicitly considers the manifold structure of the tensor space formed by the temporally sequenced, co-occurring multimodal media data in video. Transductive learning uses a large amount of unlabeled data together with labeled data to build better classifiers, and a transductive support tensor machine algorithm is proposed to train an effective classifier. This algorithm preserves the intrinsic structure of the submanifold from which the TensorShots are sampled and can also map out-of-sample data points directly; moreover, the use of unlabeled data improves classification ability. Experimental results show that the method improves the performance of video semantic concept detection. %K multi-modality %K TensorShot %K temporal associated co-occurrence (TAC) %K higher-order SVD (HOSVD) %K dimensionality reduction %K transductive support tensor machine (TSTM)
%U http://www.alljournals.cn/get_abstract_url.aspx?pcid=5B3AB970F71A803DEACDC0559115BFCF0A068CD97DD29835&cid=8240383F08CE46C8B05036380D75B607&jid=7735F413D429542E610B3D6AC0D5EC59&aid=302CC29B16C1420C5EF63A5FB7613E7E&yid=67289AFF6305E306&vid=2A8D03AD8076A2E3&iid=708DD6B15D2464E8&sid=720E49F6DB248E5D&eid=45B945AE36F6880C&journal_id=1000-9825&journal_name=软件学报&referenced_num=3&reference_num=33
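The abstract above describes representing each video shot's image, audio, and text features as a 3rd-order tensor and reducing its dimensionality via a higher-order SVD (HOSVD). The Python/NumPy sketch below is not the paper's TensorShot or TSTM algorithm; it only illustrates a plain truncated HOSVD on a synthetic shot tensor, and the dimensions, ranks, and the hosvd_truncate helper are illustrative assumptions.

import numpy as np

def hosvd_truncate(tensor, ranks):
    # Truncated higher-order SVD: for each mode, take the leading left
    # singular vectors of the mode-n unfolding, then project the tensor
    # onto those per-mode subspaces to obtain a smaller core tensor.
    factors = []
    for mode, r in enumerate(ranks):
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(u[:, :r])
    core = tensor
    for mode, u in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Hypothetical shot tensor: image x audio x text feature axes with
# made-up sizes; a real system would fill it from extracted features.
rng = np.random.default_rng(0)
shot = rng.standard_normal((64, 32, 16))
core, factors = hosvd_truncate(shot, ranks=(8, 8, 4))
print(core.shape)  # (8, 8, 4): the reduced tensor representation

In the paper's setting, the reduced shot representations would then be fed to the transductive support tensor machine classifier; that step is specific to the paper and is not sketched here.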