Keywords
|
Mobile Ad hoc Networks (MANETs), Data Replication, Data Availability, Query Delay. |
INTRODUCTION
|
A mobile ad hoc network (MANET) is a fast growing area of research. It is a network functioning without any infrastructure and consists of mobile nodes that serve both as host and routers to forward packets on behalf of each other [1, 2]. MANET differs significantly from wired networks in terms of network topology, network configuration and network resources. Features of MANETs are dynamic topology due to node mobility, partition of network due to unreliable link and limited resources such as limited battery and limited memory capacity [1, 2]. Data sharing is one of the important functionality to be supported in MANETs. Without this facility, the advantage of MANET is greatly reduced [3].The best example where data sharing is important in conference where several users share their presentations on discussing a particular issue, and it is also applicable in military application, rescue operation, disaster management etc. The approach used for data sharing heavily depends upon the features of the MANET [3]. The frequent network partition due to node mobility or limited battery power reduces the data availability in the network. To improve data availability, replication algorithm deals all these issues such that data is available at all times. |
In this paper, introduction section is followed by data replication in section II. Section III discusses the issues that complicate data replication. An overview of existing works is explained in section IV which is continued by proposed solution and conclusion in section V and VI. |
DATA REPLICATION
|
Data Replication is a technique which enhances the data availability by making copies of data items. Replication allows better data sharing. It is a key approach for achieving high availability. Data replication has been widely used to improve data availability in distributed systems, and we will apply this technique to MANETs. It is suitable to improve the response time of the access requests, to distribute the load of processing of these requests on several servers and to avoid the overload of the routes of communication to a unique server. |
Figure 1 shows an example of how data replication can be used to improve the performance of data access when network partitions. There are four mobile hosts in the network. MH4 is a web camera which continuously records video clips d1 of its surroundings. Two clients MH1 and MH2 periodically access these video clips by using N3 as relay. However, when a disconnection occurs between MH3 and MH4 due to a link failure, d1 becomes inaccessible to the other three nodes. To improve data availability, a copy of d1 can be replicated at MH3 before the disconnection. Then both MH1 and MH2 can access d1 even if they are not able to connect with MH4. Further, by replicating a copy of d1 at MH3, MH1 and MH2 can access d1 within one hop and thereby reducing the query delay [15]. |
RESEARCH ISSUES RELATED TO DATA REPLICATION
|
Data replication technique for MANET will be having the following issues arising from constraints created by their specific environments and applications [4, 5]: |
A. Power consumption |
Mobile nodes in the MANET are battery powered. If a node with less power is replicated with many frequently accessed data items, it soon gets drained and it cannot provide services any more. Thus replication algorithm should replicate data in the nodes that has sufficient power by periodically checking the remaining power of each node. |
B. Node mobility |
In MANET, nodes are mobile which leads to dynamic topology. Thus replication algorithm has to support mobility prediction such that if a node is likely to move away from the network, its replicas will be replaced in some other nodes which is expected to remain in the network for a particular period of time. |
C. Resource availability |
Since nodes participating in MANET are portable devices, memory capacity is limited. Before sending a replica to the node, the algorithm has to check whether a node has sufficient memory capacity to hold the replica. |
D. Real-time applications |
MANET applications like rescue and military operations are time-critical and may contain both firm and soft realtime transactions. Therefore, the replication technique should be able to deliver correct information before the expiry of transaction deadlines, taking into consideration both real-time firm and soft transaction types in order to reduce the number of transactions missing their deadlines. |
E. Network partitioning |
Due to frequent disconnection of mobile hosts, network partitioning occurs more often in MANET databases than in traditional databases. Network partitioning is a severe problem in MANET when the server that contains the required data is isolated in a separate partition, thus reducing data accessibility to a large extent. Therefore, the replication technique should be able to determine the time at which network partitioning might occur and replicate data items beforehand. |
AN OVERVIEW OF EXISTING TECHNIQUES
|
A. Data Replication Techniques |
L.Qiu et.al, 2001 [6] describes that data replication has been extensively studied in the web environment, where the goal is to place some replicas of the web servers among possible number of locations so that the query delay is minimized. In the web environment, links and nodes are stable. Thus, the performance is mainly measured by the query delay. Moreover, these schemes replicate at the whole database level, that is the entire database is replicated as a unit to one or more locations. This scheme is more complex when replication is done at the data item level, because of limited memory space. |
In distributed database systems, L. Gao et.al, 2003 [7] explained an emerging edge services architecture promises to improve the availability and performance of web services by replicating servers. A key challenge in such systems is data replication and consistency, so that edge server code can manipulate shared data without incurring the availability and performance penalties that would be incurred by accessing a traditional centralized database. This paper explores using distributed object architecture to build an edge service system for an e-commerce application, an online book store represented by the TPC-W benchmark. The experimental result showed that throughput and response time of the system are consistent before, during, and after network partition. By measuring latencies, author showed that the response time of the system closely approximates that of the ideal system, and the system performance is dramatically improved comparing to the traditional architecture. They take advantage of application specific semantics to design distributed objects to manage a specific subset of shared information using simple and effective consistency models. For example, in one experiment they find that their object-based edge server system provides a factor of five improvements in response time over traditional centralized cluster architecture and a factor of nine improvements over an edge service system that distributes code but retains a centralized database. Such systems, nodes that host the database are more reliable and less likely to fail/disconnect compared to those in MANETs. Therefore, a small number of replicas can be used to provide high availability. |
T.Hara, 2001 [8] proposed the effective replica allocation methods in ad hoc networks for improving data accessibility. Author proposed three replica allocation methods to improve data accessibility by replicating data items on mobile hosts i.e. Static Access Frequency (SAF) method, Dynamic Access Frequency and Neighbourhood (DAFN) method and Dynamic Connectivity based Grouping (DCG) method. These techniques make the following assumptions: (i) each data item and each mobile host is assigned a unique identifier, (ii) every mobile host has finite memory space to store replicas; (iii) there are no update transactions; and (iv) the access frequency of each data item, which is the number of times a particular mobile host accesses that data item in a unit time interval, is known and does not change. The decision of which data items are to be replicated on which mobile hosts is based on the data items access frequencies and such a decision is taken during a certain period of time, called the relocation period. In the SAF method, a mobile host allocates replicas with high access frequencies. In the DAFN method, replicas are preliminary allocated based on the SAF method, and then the replica duplication is eliminated among neighbouring mobile hosts. In the DCG method, stable groups of mobile hosts are created, and replicas are shared in each group. The experimental result shows that in most cases the DCG method gives the highest accessibility, and the SAF method gives the lowest traffic. |
T.Hara, 2002, [9, 10] showed an extensions to the above three techniques were proposed to improve data accessibility in an environment where both periodic and aperiodic updates can occur. These extensions assume that every data item is broadcasted periodically. Each data item is updated only by the mobile host that has the original copy of the data item. Replicas of a data item become invalid if the nodes that contain these replicas cannot connect to the node that has its original copy. The proposed extensions are based on the access frequencies of data items [8], as well as their PT values. A PT value of a data item represents the average number of requests that are issued for the data item before it is updated next. This is defined as the product of the access frequency of the data item and the time remaining until this data item is broadcasted next. A data item that has a higher PT value will then be replicated before a data item that has a lower PT value. This is done to ensure effective replication of data items considering the number of accesses prior to the next update. All these three extended techniques, E-SAF (extended SAF), E-DAFN (extended DAFN) and E-DCG (extended DCG), work the same way as SAF, DAFN and DCG except that the PT values, instead of the access frequencies, are used. These schemes are based on the intuition that replicating the same data near neighbouring nodes. |
T.Hara et. al, 2004 [11] proposed an extension of the replication techniques[8] to replicate data based on the data priority of data items that is based on the correlation among data items. Correlation between data items is the probability that clients collectively access a set of data items. Stronger the correlation, higher the probability that a particular set of data items will be accessed together. They assume that clients access a set of correlated data items by submitting multiple access requests at the same time. Apart from the assumptions [8], they assume that the strength of correlation between the data items is known and does not change. Replicas are relocated in a specific period called the relocation period. In every relocation period, the replica allocation is determined based on the correlation between data items and the network topology at that moment. These extended techniques replicate data items based on the access relationship between them. Authors also address the issues caused by host mobility as the decision to replicate data items is taken periodically based on the current topology of the network. However, the techniques might yield the best results only when most of the data are accessed in correlation with each other. |
T.Hara, 2006 [10] assumed that replicas of a data item become invalid after the host holding the original updates and the consistency of data operations on replicas is kept in the entire network. However, since network partitioning frequently occurs in a MANET, this strong consistency management scheme heavily deteriorates the data availability. Moreover, many applications in MANETs do not require such a strong consistency. So, T.Hara et.al, 2009 [12] proposed consistency management strategies for data replication in MANETs. Authors classified different consistency levels according to applications demand and provides protocols to realize them i.e. Global Consistency (GC), Local Consistency (LC), Time-Based Consistency (TC), Peer-Based Consistency (PC). Since mobile hosts’ mobility causes frequent network partitioning, consistency management of data operations on replicas becomes a crucial issue. In such an environment, the global consistency of data operations on replicas is not desirable by many applications. Thus, new consistency maintenance based on local conditions such as location and time is investigated. The MANET consists of two kinds of mobile nodes: proxies and peers. They proposed protocols to achieve consistency conditions of data operations on replicas in MANETs and then, discuss the impact of replica allocation for the system performance when the memory space of mobile hosts is limited. |
J.L. Huang et.al, 2006 [13] addressed the problem of replica allocation in a MANET by exploring group mobility. The growth in wireless communication technologies attracts a considerable amount of attention in mobile ad hoc networks. Since mobile hosts in an ad hoc network usually move freely, the topology of the network changes dynamically and disconnection occurs frequently. These characteristics make it likely for a mobile ad hoc network to be separated into several disconnected partitions, and the data accessibility is hence reduced. Several schemes are proposed to alleviate the reduction of data accessibility by replicating data items. However, little research effort was elaborated upon exploiting the group mobility where the group mobility refers to the phenomenon that several mobile nodes tend to move together. Authors first analyse the group mobility model and derive several theoretical results. In light of these results, they propose a replica allocation scheme, DRAM to improve the data accessibility. Several experiments are conducted to evaluate the performance of the proposed scheme. The experimental results show that the proposed scheme is able to not only obtain higher data accessibility, but also produce lower network traffic than prior schemes. |
V. Ramany et.al, 2008 [14] studied solutions for replicating location dependent data in MANETs to handle unreliable network connections. Replication aims to improve accessibility, shorter response time and fault tolerance. When data is associated with geographical location in the network and valid only within a region around that location, the benefits from replication will apply only within this region. In MANETs, nodes move in and out of a region and can even leave the network completely, which leads to frequent changing of replica-holders. As mobile nodes have usually constrained processing power and memory, replica holders need to be selected carefully in such networks, to reduce communication overhead. This paper proposes a solution for replication of location dependent data in mobile ad hoc networks. It will be shown that an improvement of 20% in hit ratio is achieved in accessing data items with only a moderate increase in total traffic generated. The scalability of the solution with regards to the increase in the number of nodes or data items in the network will also be shown to be good. |
Yang Zhang et.al, 2012 [15] describes, in MANETs, nodes move freely and link/node failures are common, which leads to frequent network partitions. When a network partition occurs, mobile nodes in one partition are not able to access data hosted by nodes in other partitions, and hence significantly degrade the performance of data access. To solve this problem, data replication techniques are used. Existing data replication solutions in wired or wireless networks aim at either reducing the query delay or improving the data availability, but not both. As both metrics are important for mobile nodes, authors propose schemes like One-To-One Optimization (OTOO), Reliable Neighbour (RN) and Reliable Group (RG) to balance the trade-offs between these two metrics under different system settings and requirements. The basic idea is to replicate the most frequently accessed data locally and only rely on neighbour’s memory when the communication link to them is reliable. |
Table 1 shows the features of some data replication techniques in ad hoc and MANETs discussed above. Besides data replication, caching can be used to improve data availability and reduce the query delay. Caching consist of different techniques to improve data accessibility. |
B. Caching Techniques |
Qun Ren et.al, 2003 [16] represent that semantic caching is very attractive for use in distributed systems due to the reduced network traffic and the improved response time. It is particularly efficient for a mobile computing environment, where the bandwidth of wireless links is a major performance bottleneck. The idea of semantic caching is that the client maintains in the cache both semantic descriptions and results of previous queries. If a new query is totally answerable from the semantic cache, no communication with the server is necessary, if it can only be partially answered, the original query is trimmed and the trimmed part is sent to the server and processed there. By processing queries in this way, the amount of data transferred over the network can be substantially reduced. Since a semantic cache is composed of a set of semantic segments, they view the query processing problem from two layers. The first layer deals with how to process a query from a semantic segment and the second layer deals with how to process a query from the whole cache. Authors presented a data structure called the query plan tree, which provides a detailed plan about how to process a query from a cache. With many intrinsic advantages, it can be widely used in client-server environments, mobile computing, heterogeneous systems, web applications, etc. |
F. Sailhan et.al, 2005 [17] proposed a scalable service discovery protocol for MANETs. MANETs conveniently complement infrastructure-based networks, allowing mobile nodes to spontaneously form a network and share their services, including bridging with other networks, either infrastructure based or ad hoc. However, distributed service provisioning over MANETs requires adequate support for service discovery and invocation, due to the network’s dynamics and resource constraints of wireless nodes. While a number of existing service discovery protocols have shown to be effective for the wireless environment, these are mainly aimed at infrastructure-based and/or 1-hop ad hoc wireless networks. Some discovery protocols for MANETs induce significant traffic overhead, and are thus primarily suited for small-scale MANETs with few nodes. While decentralized service discovery is a priori more suited to the specifics of MANETs by not assigning any specific role to given nodes, it is too costly in terms of traffic. But, centralized service discovery based on service directories either makes the protocol too much dependent on specific nodes or induces too much resource consumption in the network. Scalability of their solution further comes from the fact that: (a) proposed protocol minimizes the generated traffic using border casting, and (b) the directory content is summarized using a Bloom filter, which provides a highly compact summary that may be exchanged with other directories, and is used to accurately locate the directory that holds the description of a given service. |
Liangzhong Yin et.al, 2006 [18] designed and evaluated cooperative cache based data access techniques to efficiently support data access in ad hoc networks. Cooperative caching allows the sharing and coordination of cached data among multiple nodes, can further explore the potential of the caching techniques. In this technique after a node sends a data request to the data owner, the data owner sends the data back. Since the data may go through multi-hops before reaching the requester. Intermediate nodes may cache the data or the path to the data. Later, if some other nodes request for the same data and the intermediate nodes can return the data or the path to the data. As the number of hops is reduced, the query delay is also reduced. The proposed schemes are CacheData, CachePath and HybridCache. In CacheData, intermediate nodes cache the data to serve future requests instead of fetching data from the data center. If node caches the same data items, it wastes a large amount of cache space. To avoid this, a conservative rule should be followed: A node does not cache the data if all requests for the data are from the same node. In CachePath, mobile nodes cache the data path and use it to redirect future requests to the nearby node which has the data instead of the faraway data center. HybridCache takes advantage of CacheData and CachePath while avoiding their weaknesses. Specifically, when a node forwards a data item, it caches the data or path based on some criteria. These criteria include the data item size si, the TTL time TTLi, and the Hsave (number of hops that a cached path can save). |
J. Cao et.al, 2007 [19] proposed data consistency for cooperative caching in mobile environments. Cooperative caching improves system performance because it allows sharing and coordination of cached data among multiple mobile users in the network. By cooperatively caching frequently accessed information, mobile devices do not always have to send requests to the data source. In addition to reducing query latency, this technique can lower mobile host communication overhead and energy consumption. However, the unreliable communication and the user’s mobility make it difficult to maintain cache consistency. In response to this problem, authors have developed a 3D model that captures the main features of cache consistency schemes and provides a basis for evaluating existing strategies as well as designing new ones. Based on this model, they propose a hybrid and generic strategy: relay-peer-based cache consistency (RPCC). RPCC uses relay peers between the source hosts and cache hosts to forward updated information, it can divide cache invalidation into two asynchronous procedures. A strategy for maintaining consistency of cooperatively cached data in a mobile wireless network can focus on three design aspects: consistency level, consistency control, and cache status maintenance. There are three levels of cache consistency: strong, delta, and weak. In consistency control, both the source and the cache nodes can initiate consistency checking. The source initiated approach puts the burden of consistency maintenance on the source node, it is a kind of push (PS) operation in that the source node pushes the update message to other cache nodes. In contrast, the cache-initiated approach is a kind of pull (PL) operation that needs individual cache nodes to pull the update messages from the source data. In cache status maintenance, the source node may or may not record the status of its cache nodes. In the stateful (SF) approach, the source keeps the information about which mobile hosts cache which data items. The stateless (SL) approach does not require the source node to be aware of the cache nodes’ state. Relay peer based cache consistency is a novel strategy that combines the advantages of both push-based and pull-based approaches while avoiding their weaknesses. |
B. Tang et.al, 2008 [20] explained that data caching can significantly improve the efficiency of information access in a wireless ad hoc network by reducing the access latency and bandwidth usage. Authors developed a paradigm of data caching techniques to support effective data access in ad hoc networks. In particular, they considered the memory capacity constraint of the network nodes and developed efficient algorithms to determine near-optimal cache placements to maximize reduction in overall access cost. Reduction in access cost leads to communication cost savings and, hence, better bandwidth usage and energy savings. In this paper, they consider the cache placement problem of minimizing total data access cost in ad hoc networks with multiple data items and nodes with limited memory capacity. Cache placement algorithm consist of centralized greedy algorithm(CGA) and distributed greedy algorithm(DGA) .The novel contribution in this work is the development of a 4-approximation centralized algorithm, which is naturally amenable to a localized distributed implementation. The designed centralized algorithm is essentially a greedy approach, and referred it as CGA. The distributed implementation uses only local knowledge of traffic. The advantage of DGA is that it adapts to dynamic traffic conditions and can be easily implemented in an operational network with low communication overheads. |
PROPOSED SOLUTION
|
Authors highlight that data replication techniques is unavoidable in MANETS for improving the data availability. However Survey shows that certain factors like storage issues, link failure, consistency management, power have considered with certain conditions. The proposed solutions in caching [16-20] can reduce the query delay but there is a limitation on the tolerance level that can achieve. For mobile nodes, data availability and query delay are important metrics. From survey, data replication techniques [15] like One-To-One Optimization (OTOO), Reliable Neighbour (RN) and Reliable Group (RG) considers both the metrics. The data replication techniques alone [6-14] either reducing delays in query or improving data availability, not together. But these schemes [15] balance the tradeoffs between query delay and data availability. In OTOO, each mobile node only cooperates with at most one neighbour to decide which data to replicate. In RN, part of the node’s memory is used to hold data for its reliable neighbours. In RG, there is no redundant replication until every data item is replicated at least once. Therefore, the maximum degree of cooperation within the reliable group can be achieved. Authors here also claim that, the proposed schemes outperform the existing methods/process in terms of data availability and query delay. |
CONCLUSION
|
In MANETs, as the nodes move freely, disconnections occur frequently and this causes frequent network partitioning. This will lead to poor data accessibility compared to conventional fixed network. In order to prevent deterioration of data accessibility at the point of network division, we suggest the replication of data items by using One-To-One Optimization scheme, Reliable Neighbour scheme and Reliable Group scheme. Along with the data replication technique, the suggested three techniques will also reduce the query delay. So, authors strongly feel that a balance between data availability and query delay in MANETs may be achieved by the above suggested method [15]. |
Tables at a glance
|
|
Table 1 |
|
|
Figures at a glance
|
|
Figure 1 |
|
|
References
|
- S. C.Sivaram Murthy and B.S Manoj,”Ad Hoc Wireless Networks ", Pearson Education, Second Edition India, 2001.
- MANET Tutorial by Nithin Vaidya, INFOCOM, 2006.
- Lixin Wang ,”File Sharing on a mobile ad hoc Network”, Master Thesis,Department of Computer Science at the University of Saskatchewan, Canada, 2003.
- Prasanna, Dr.Greenwald,”Managing data replication in mobile ad hoc networks database”, International Conference on Collaborative Computing , IEEE Explorer, pp. 1-10, 2006.
- P. Padmanabhan, L. Gruenwald, A. Vallur, and M. Atiquzza, “A survey of data replication techniques for mobile ad hoc network databases,” The VLDB Journal, Vol. 17, pp. 1143–1164, 2008.
- L. Qiu, V. N. Padmanabhan, and G. M. Voelker, “On the placement of web server replicas,” IEEE INFOCOM, pp. 1587-1596, 2001.
- L. Gao, M. Dahlin, A. Nayate, J. Zheng, A. Iyengar, “Consistency and replication: application specific data replication for edge services,” International Conference on World Wide Web, 2003.
- T. Hara, “Effective replica allocation in ad hoc networks for improving data accessibility,” IEEE INFOCOM, 2001.
- T. Hara, “Replica allocation in ad hoc networks with periodic data update,” International Conference on Mobile Data Management (MDM), 2002.
- T. Hara and S. K. Madria, “Data replication for improving data accessibility in ad hoc networks,” IEEE Transactions on Mobile Computing, Vol. 5, No. 11, pp. 1515–1532, 2006.
- Hara.T, Murakami.N, Nishio.S, Replica allocation for correlated data items in ad hoc sensor networks, ACM SIGMOD RECORD, pp. 38–43, 2004.
- T. Hara and S. Madria, “Consistency management strategies for data replication in mobile ad hoc networks,” IEEE Transactions on Mobile Computing, Vol. 8, No. 7, pp. 950–967, 2009.
- J.L. Huang and M.-S. Chen, “On the effect of group mobility to data replication in ad hoc networks,” IEEE Transactions on Mobile Computing, Vol. 5, No. 5, pp. 492–507, 2006.
- V. Ramany and P. Bertok, “Replication of location-dependent data in mobile ad hoc networks,” ACM Mobile, pp. 39–46, 2008.
- Yang Zhang et.al, “Balancing the Trade-Offs between Query Delay and Data Availability in MANETs”, IEEE Transactions on Parallel and Distributed Systems, Vol. 23, No. 4, pp.643-650, 2012.
- Q. Ren, M. Dunham, and V. Kumar, “Semantic caching and query processing,” IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 1, pp. 192–210, 2003.
- F. Sailhan and V. Issarny, “Scalable service discovery for MANET”, IEEE International Conference on Pervasive Computing and Communications, pp. 235–244, 2005.
- L. Yin and G. Cao, “Supporting cooperative caching in ad hoc networks,” IEEE Transaction on Mobile Computing, Vol. 5, No. 1, pp. 77-89, 2006.
- J. Cao, Y. Zhang, G. Cao, and L. Xie, “Data consistency for cooperative caching in mobile environments,” IEEE Computer, Vol. 40, No. 4, pp. 60–66, 2007.
- B. Tang, H. Gupta, and S. Das, “Benefit-based data caching in ad hoc networks,” IEEE Transactions on Mobile Computing, Vol. 7, No. 3,pp. 289–304, 2008.
|