Search Results: 1 - 10 of 100 matches for " "
All listed articles are free for downloading (OA Articles)
Research on Blog

YANG Yu-Hang,ZHAO Tie-Jun,YU Hao,ZHENG De-Quan

Journal of Software (软件学报) , 2008,
Abstract: The popularity of bloggers and the amount of information in the blogosphere are increasing fast. Through frequent links and information exchange, blogs have formed a dynamic, tightly connected social network and have become an important source of information about the real world. Most research on blogs concentrates on blog definition and identification, content mining, community discovery, importance analysis, blog search, and spam blog identification. Most of this work uses link analysis and natural language processing methods, and some blog-specific methods have also been proposed. This paper analyzes and compares this research on the blogosphere, discusses problems with current topics, and proposes future directions.
Research on Frequent Itemsets Mining Algorithm Based on High-dimensional Sparse Dataset

YAN Zhen,PI De-chang,WU Wen-hao

Computer Science (计算机科学) , 2011,
Abstract: Because traditional mining algorithms are not suitable for mining high-dimensional sparse datasets, this paper proposes a new frequent itemsets mining algorithm named FIHS (Frequent mining algorithm based on High-dimensional Sparse dataset). FIHS adopts a new data structure to store frequent itemsets, which reduces both the storage space and the cost of counting. By optimizing the connection and pruning operations, FIHS avoids generating infrequent candidate itemsets and requires scanning the dataset only once. Moreover, frequent (K+1)-itemsets can be created from frequent K-itemsets just by applying AND/OR operations, and the data structure is simple to maintain. Theoretical analysis and experiments show that, on high-dimensional sparse datasets, the improved algorithm offers advantages such as faster mining and lower memory use.
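The idea of deriving frequent (K+1)-itemsets from frequent K-itemsets with a simple logical operation can be illustrated with a minimal sketch. This is not the FIHS algorithm itself, whose data structure the abstract does not specify; it assumes a vertical layout in which each itemset keeps a bit vector of the transactions containing it, so that joining two itemsets is a bitwise AND. The function name `mine_frequent_itemsets` and its parameters are hypothetical.

```python
from itertools import combinations


def popcount(bits):
    """Number of set bits, i.e. the support count of a bit vector."""
    return bin(bits).count("1")


def mine_frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Assumption (not from the abstract): each itemset is stored as an integer
    bit vector in which bit t is set when transaction t contains the itemset.
    """
    # Single scan of the dataset: build one bit vector per item.
    vectors = {}
    for t, items in enumerate(transactions):
        for item in items:
            key = frozenset([item])
            vectors[key] = vectors.get(key, 0) | (1 << t)

    # Frequent 1-itemsets.
    current = {k: v for k, v in vectors.items() if popcount(v) >= min_support}
    result = {k: popcount(v) for k, v in current.items()}

    # Level-wise growth: join pairs of frequent K-itemsets that differ in
    # exactly one item; the bit vector of the union is the AND of the two.
    while current:
        next_level = {}
        for (a, bits_a), (b, bits_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) != len(a) + 1 or union in next_level:
                continue
            bits = bits_a & bits_b
            if popcount(bits) >= min_support:
                next_level[union] = bits
        result.update({k: popcount(v) for k, v in next_level.items()})
        current = next_level
    return result
```

Because the support of every candidate is read off the bit vectors, the transactions are scanned only once, in the first loop. Integer bit vectors are chosen here only for brevity; for very sparse data, sorted lists of transaction ids would usually take less memory.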
Data Mining using Unguided Symbolic Regression on a Blast Furnace Dataset
Michael Kommenda,Gabriel Kronberger,Christoph Feilmayr,Michael Affenzeller
Computer Science , 2013, DOI: 10.1007/978-3-642-27549-4_51
Abstract: In this paper a data mining approach for variable selection and knowledge extraction from datasets is presented. The approach is based on unguided symbolic regression (every variable present in the dataset is treated as the target variable in multiple regression runs) and a novel variable relevance metric for genetic programming. The relevance of each input variable is calculated and a model approximating the target variable is created. The genetic programming configurations with different target variables are executed multiple times to reduce stochastic effects and the aggregated results are displayed as a variable interaction network. This interaction network highlights important system components and implicit relations between the variables. The whole approach is tested on a blast furnace dataset, because of the complexity of the blast furnace and the many interrelations between the variables. Finally the achieved results are discussed with respect to existing knowledge about the blast furnace process.
Supervised Learning Based Data Mining Technology with Its Application to Life Insurance Dataset Analysis
Junpeng Zhang,Yuan Cui,Weihua Liu
International Journal of Business and Management , 2009,
Abstract: This paper introduces the concept of data mining and presents two data mining methods: supervised learning and unsupervised learning. The authors use a life insurance dataset to explain the process of supervised learning. The experiment shows that people who had purchased life insurance share some common attributes. The data mining results can help insurance sales representatives improve efficiency and focus on the people who are more likely to buy life insurance.
The Blog - a new paradigm for exploiting the educational resources
Informatica Economica Journal , 2007,
Abstract: Blogs open the way towards a new learning environment in which the teacher and the student are connected and thus become more powerful, more motivated and more reflective. In this way, the balance between individualized technologies and centralized ones is more stable. The use of web logs within an educational system will have a limited impact if the present dynamics of this communication system are ignored. In the light of these data, a possible resemblance with the implementation of institutionalized systems of study is underlined as an alternative form of the online studying environment.
Domain Driven Multi-Feature Combined Mining for Retail Dataset
Arti Deshpande,Dr. Anjali Mahajan
International Journal of Engineering and Advanced Technology , 2013,
Abstract: Association mining is used to generate patterns from the static data available. From a business perspective, however, the usefulness and understandability of the resulting rules are more important. Classical association mining generates many redundant rules that may not be useful for business analysis. The proposed framework generates combined rules that give informative knowledge for business by combining static and transactional data. This paper presents a pruning method that removes redundant rules before the combined rules are generated. Finally, rule clusters are generated for similar customer groups or similar transaction characteristics, which provide more interesting knowledge and more actionable results than traditional association rules. Experimental results demonstrate the proposed techniques.
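To make the notion of redundant rules concrete, here is a minimal sketch of classical association rule generation followed by one common pruning criterion. The abstract does not describe the paper's own pruning method, so the criterion is an assumption: a rule is dropped when a rule with the same consequent and a strictly smaller antecedent has at least the same confidence. The function names and thresholds are hypothetical, and itemsets are enumerated by brute force for clarity rather than speed.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def generate_rules(transactions, min_support, min_confidence):
    """Return rules as (antecedent, consequent item, support, confidence)."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset, transactions)
            if s < min_support:
                continue
            # Only single-item consequents are considered in this sketch.
            for item in combo:
                antecedent = itemset - {item}
                confidence = s / support(antecedent, transactions)
                if confidence >= min_confidence:
                    rules.append((antecedent, item, s, confidence))
    return rules


def prune_redundant(rules):
    """Drop a rule if a more general rule predicts the same item equally well.

    Assumed criterion: same consequent, antecedent a proper subset,
    confidence at least as high.
    """
    kept = []
    for antecedent, consequent, s, confidence in rules:
        redundant = any(
            other_cons == consequent
            and other_ant < antecedent
            and other_conf >= confidence
            for other_ant, other_cons, _, other_conf in rules
        )
        if not redundant:
            kept.append((antecedent, consequent, s, confidence))
    return kept
```

For example, if `{bread} -> milk` already holds with confidence 1.0, the longer rule `{bread, eggs} -> milk` adds nothing for an analyst and is removed by `prune_redundant`.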
Pattern Mining and Rules Generation in Multidimensional Aircraft Accidental Dataset
Anju Singh,R. C. Jain
International Journal of Electronics Communication and Computer Engineering , 2012,
Abstract: To study aircraft accidents around the world, we collected data from various surveys and used it to form multidimensional association rules and a model of aircraft accidents. Multidimensional association rule mining is applied to discover the various causes, sources and circumstances of accidents in aviation. Based on this approach, preventive measures can be taken to reduce the various causes of accidents in the aviation field.
A new bed elevation dataset for Greenland
J. A. Griggs,J. L. Bamber,R. T. W. L. Hurkmans,J. A. Dowdeswell
The Cryosphere Discussions , 2012, DOI: 10.5194/tcd-6-4829-2012
Abstract: We present a new bed elevation dataset for Greenland derived from a combination of multiple airborne ice thickness surveys undertaken between the 1970s and 2011. Around 344 000 line kilometres of airborne data were used, with the majority of this having been collected since the year 2000, when the last comprehensive compilation was undertaken. The airborne data were combined with satellite-derived elevations for non glaciated terrain to produce a consistent bed digital elevation model (DEM) over the entire island including across the glaciated/ice free boundary. The DEM was extended to the continental margin with the aid of bathymetric data, primarily from a compilation for the Arctic. Ice shelf thickness was determined where a floating tongue exists, in particular in the north. The across-track spacing between flight lines warranted interpolation at 1 km postings near the ice sheet margin and 2.5 km in the interior. Grids of ice surface elevation, error estimates for the DEM, ice thickness and data sampling density were also produced alongside a mask of land/ocean/grounded ice/floating ice. Errors in bed elevation range from a minimum of ±6 m to about ±200 m, as a function of distance from an observation and local topographic variability. A comparison with the compilation published in 2001 highlights the improvement in resolution afforded by the new data sets, particularly along the ice sheet margin, where ice velocity is highest and changes most marked. We use the new bed and surface DEMs to calculate the hydraulic potential for subglacial flow and present the large scale pattern of water routing. We estimate that the volume of ice included in our land/ice mask would raise eustatic sea level by 7.36 m, excluding any solid earth effects that would take place during ice sheet decay.
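The hydraulic potential calculation mentioned in the abstract can be sketched as follows. The abstract does not give the formula used; this sketch assumes the widely used Shreve approximation, in which subglacial water pressure equals the ice overburden pressure, so the potential is rho_water * g * bed + rho_ice * g * (surface - bed). The function name, the density values and the value of g are illustrative assumptions, not figures from the paper.

```python
def hydraulic_potential(bed, surface, rho_ice=917.0, rho_water=1000.0, g=9.81):
    """Hydraulic potential in Pa at the ice sheet bed.

    bed and surface are elevations in metres above sea level.
    Assumption: water pressure equals ice overburden pressure (Shreve
    approximation); the densities (kg/m^3) and g (m/s^2) are typical values.
    """
    ice_thickness = surface - bed
    elevation_term = rho_water * g * bed          # elevation potential of water
    pressure_term = rho_ice * g * ice_thickness   # ice overburden pressure
    return elevation_term + pressure_term


def potential_grid(bed_grid, surface_grid):
    """Apply hydraulic_potential cell by cell to two equally shaped 2-D grids."""
    return [
        [hydraulic_potential(b, s) for b, s in zip(bed_row, surface_row)]
        for bed_row, surface_row in zip(bed_grid, surface_grid)
    ]
```

Under this assumption water flows from high to low potential, so the routing pattern follows the downhill direction of the grid returned by `potential_grid`. Because the ice density is close to the water density, the surface slope influences the routing roughly ten times more strongly than the bed slope.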
Knowledge Discovery Process: Guide Lines for New Researchers
L. Al-Shalabi
Journal of Artificial Intelligence , 2011,
Abstract: This paper presents guidelines for new researchers who are interested in the field of knowledge discovery, and especially in data mining. Three engines are described: the preprocessing engine, which prepares the data in a dataset that is later called the training dataset; the processing engine, which is the data mining engine and carries out training on the training dataset; and the after-processing engine, which describes and represents the new knowledge as a knowledge discovery or data mining model. The challenge is how to prepare the data for data mining. The data should be of high quality so that the data mining system can achieve high accuracy.
Deposit subscribe Prediction using Data Mining Techniques based Real Marketing Dataset
Safia Abbas
Computer Science , 2015, DOI: 10.5120/19293-0725
Abstract: The recent economic depression, which swept across the world, has affected business organizations and the banking sector. These economic conditions cause severe attrition for banks, and customer retention becomes impossible. Accordingly, marketing managers need to increase marketing campaigns, while organizations avoid both expenses and business expansion. To resolve this dilemma, data mining techniques are used as a key tool for data analysis, data summarization, hidden pattern discovery, and data interpretation. In this paper, rough set theory and decision tree mining techniques are implemented using real marketing data obtained from a Portuguese marketing campaign related to bank deposit subscription [Moro et al., 2011]. The paper aims to improve the efficiency of marketing campaigns and to help decision makers by reducing the number of features that describe the dataset, highlighting the most significant ones, and predicting the deposit customer retention criteria based on potential predictive rules.