Summary of Paper
Storage and Querying of E-Commerce Data
Muhammad Yahya
Computer and Information System Engineering Department
muhammad.yahya@gmail.com
1 Abstract
The new generation of e-commerce applications requires data schemas that are constantly evolving and sparsely populated. The conventional horizontal row representation fails to meet these requirements. Horizontal data can be transformed into a Vertical representation, in which each row holds an Object Identifier and an attribute name-value pair.
2 Introduction
In large e-commerce systems, the number of attributes in some tables keeps increasing as new items are added. Performance becomes a bottleneck because most of the columns hold no values. The schema also has to be changed again and again to accommodate new commodities.
2.1 Issues
The following problems were faced in the horizontal representation:
· Large number of columns: As many columns are required as there are object attributes, and this count keeps increasing with new objects
· Sparsity: Most of the fields contain nulls
· Schema Evolution: Frequent alterations to the design are required
· Performance: Only a few columns are needed from each wide record
2.2 Vertical Representation
As a solution to the problems above, the Vertical representation was proposed for large e-commerce systems. The vertical table has the following columns:
Oid | Key | Val |
Oid: Object Identifier
Key: Attribute Name
Val: Attribute Value
The vertical table contains tuples only for those attributes that are present in an object. However, writing SQL queries against this vertical representation is difficult and error-prone, and most existing tools are written for the Horizontal representation.
[Figure: the same data in the Horizontal and the Vertical representation. In the Horizontal table, "-" corresponds to a null value.]
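To make the mapping concrete, here is a minimal Python sketch of a horizontal-to-vertical conversion. The function name `h2v` echoes the operation name used later in this summary, but the signature, the `oid` column name, and the sample products are assumptions made for illustration, not the paper's implementation.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Optional[Any]]
Triple = Tuple[Any, str, Any]


def h2v(rows: List[Row], oid_column: str = "oid") -> List[Triple]:
    """Turn horizontal rows into (Oid, Key, Val) triples, skipping nulls."""
    triples: List[Triple] = []
    for row in rows:
        oid = row[oid_column]
        for key, val in row.items():
            # The identifier is not an attribute, and nulls are simply not stored.
            if key == oid_column or val is None:
                continue
            triples.append((oid, key, val))
    return triples


# Hypothetical sample data: two products with mostly disjoint attributes.
horizontal = [
    {"oid": 1, "color": "red", "screen_size": None, "weight": 2.5},
    {"oid": 2, "color": None, "screen_size": 15, "weight": None},
]

vertical = h2v(horizontal)
for triple in vertical:
    print(triple)
```

Six attribute cells in the horizontal rows shrink to three triples, because the three null cells produce no tuple at all.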
2.3 Alternative Representations
The following alternatives were available:
· Split the horizontal table into 2-ary (binary) tables, one table per column
· Create one table for each new category
· Create one table for the common attributes, plus a separate table per category for the non-common attributes
· Represent the data in a 3-ary table, i.e. the Vertical Representation
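As an illustration of the first alternative, the following Python sketch splits horizontal rows into one 2-ary table per column. The helper name `to_binary_tables`, the `oid` column name, the sample data, and the choice to leave nulls out of each binary table are all assumptions made for this sketch.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Optional[Any]]
BinaryTable = List[Tuple[Any, Any]]


def to_binary_tables(rows: List[Row], oid_column: str = "oid") -> Dict[str, BinaryTable]:
    """Split horizontal rows into one (Oid, Val) table per attribute column."""
    tables: Dict[str, BinaryTable] = {}
    # One table per column, even if that column turns out to hold only nulls.
    for row in rows:
        for key in row:
            if key != oid_column:
                tables.setdefault(key, [])
    # Assumption of this sketch: null values are not stored in the binary tables.
    for row in rows:
        for key, val in row.items():
            if key != oid_column and val is not None:
                tables[key].append((row[oid_column], val))
    return tables


# Hypothetical sample data.
products = [
    {"oid": 1, "color": "red", "screen_size": None, "weight": 2.5},
    {"oid": 2, "color": None, "screen_size": 15, "weight": None},
]

binary = to_binary_tables(products)
for name, table in binary.items():
    print(name, table)
```

The number of tables grows with the number of attributes, whereas the 3-ary Vertical Representation keeps everything in a single table.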
3 Transformations
The user sees a Horizontal view, and all user queries are written against that view. Behind this logical view lies the Vertical representation, which stays transparent to the user.
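The idea of a horizontal view over vertical storage can be sketched with Python's built-in `sqlite3` module. Note that this sketch uses conditional aggregation (`MAX(CASE ...)`) as a stand-in pivot technique; it is not claimed to be the formulation used in [1]. The table name `vtable`, the view name `hview`, the helper `create_horizontal_view`, the text-typed `val` column, and the sample rows are all assumptions.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")

# Vertical storage: one (oid, key, val) row per attribute that is present.
conn.execute("CREATE TABLE vtable (oid INTEGER, key TEXT, val TEXT)")
conn.executemany(
    "INSERT INTO vtable (oid, key, val) VALUES (?, ?, ?)",
    [(1, "color", "red"), (1, "weight", "2.5"), (2, "screen_size", "15")],
)


def create_horizontal_view(db: sqlite3.Connection, view_name: str, attributes: List[str]) -> None:
    """Define a horizontal view with one column per attribute name."""
    for name in [view_name] + attributes:
        # View definitions cannot take bound parameters, so names are
        # validated before being placed into the SQL text.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ",\n  ".join(
        f"MAX(CASE WHEN key = '{a}' THEN val END) AS \"{a}\"" for a in attributes
    )
    db.execute(
        f'CREATE VIEW "{view_name}" AS\nSELECT oid,\n  {columns}\nFROM vtable\nGROUP BY oid'
    )


create_horizontal_view(conn, "hview", ["color", "screen_size", "weight"])

# The user queries the horizontal view; missing attributes come back as NULL.
all_rows = conn.execute(
    "SELECT oid, color, screen_size, weight FROM hview ORDER BY oid"
).fetchall()
red_rows = conn.execute(
    "SELECT oid, color, weight FROM hview WHERE color = 'red'"
).fetchall()
print(all_rows)
print(red_rows)
```

Adding a new attribute in this sketch means inserting rows into `vtable` and redefining the view, without altering the storage table.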
4 Implementation
A non-intrusive enablement layer is built on top of the database engine.
· Vertical SQL
· Vertical UDF
· Schema SQL
5 Performance Experiments
A large number of experiments were performed on the different alternatives. The results for projection, selection, join and aggregation operations are reported in [1], where they are presented as graphs.
6 Conclusion
Emerging applications such as e-commerce and portals are creating new threats and opportunities for database technology.
              | Horizontal | Vertical | Binary |
Manageability |     +      |    +     |   -    |
Flexibility   |     -      |    +     |   -    |
Performance   |     -      |    +     |   +    |
The following enhancements, described in [1], can improve the performance of the Vertical representation:
· Enhanced table functions
· First class treatment of table functions
· Native support for v2h and h2v operations
7 References
[1] Rakesh Agrawal, Amit Somani, Yirong Xu. Storage and Querying of E-Commerce Data.
http://www.almaden.ibm.com/software/quest/Publications/papers/vldb01_ecom.pdf