<?xml version="1.0" encoding="UTF-8"?><rss version="2.0">
<channel>
<title>phantom problem</title>
<link>http://www.computersight.com/tags/phantom problem</link>
<description>New posts about phantom problem</description>
<item>
<title>The Phantom Problem</title>
<link>http://www.computersight.com/Programming/The-Phantom-Problem.39888</link>
<description>
<![CDATA[<p>The Phantom Problem can occur in database management systems in particular scenarios involving concurrency, and can cause unexpected and faulty behavior if it's not handled correctly. Let me first begin with an example.</p>


 <h3>Finding the Oldest</h3>


 <p>Let's say that transaction T1 is accessing a relation Employee(<STRONG>eid</STRONG>, name, seclevel, age), and is looking for the oldest employee with security level (seclevel) of 1, and the oldest employee with seclevel = 2. We assume that the DBMS is using page locking. T1 identifies and locks all pages containing entries where seclevel = 1, the oldest of which is 46.</p>


 <p>At this point, a transaction T2 starts executing an update, as a new employee has been hired. It creates an exclusive lock on a page that T1 is not reading, as it doesn't contain any entries with seclevel = 1. Then it inserts the new entry, where the age is 54. Then it removes the oldest employee with seclevel = 2, whose age is 65, and commits.</p>


 <p>At this point, T1 locks all pages containing entries with seclevel = 2, finds the oldest, who is 62, and returns the results.</p>


 <h3>What happened here?</h3>


 <p>The result of these interleaved operations is that T1 returns the ages 46 and 62. If the transactions had been executed serially, the result would have been either 46 and 65 (T1 first, then T2) or 54 and 62 (T2 first, then T1).</p>

 <p>As you can see, the result is not identical to any serial execution of the two transactions, which a robust DBMS should guarantee.</p>

 <h3>A Solution</h3>

 <p>To find a cure for this oddity, we can take one of two approaches, depending of the structure of the DB.</p>

 <p><ul>
  <li> If the DB has an index on the seclevel field, T1 must identify the page containing the index, and lock the ones that contain an entry with seclevel = 1. If no such entry exists, it must lock the page where there would be an entry with seclevel = 1, if one were created. This is called index locking, and prevents creation of a new entry, as the index page for the new entry is locked.</li>
  <li> If there is no index on the seclevel field, T1 must lock all existing pages for reading, as well as ensuring that no new page can be created. Thus no new entry can be created.  </li>
 </ul></p>

 <p>Both techniques ensures no modification of existing entries with seclevel = 1, and that no new entry of this type can be inserted.</p>

 <h3>Conclusion</h3>

 <p>The two possible solutions available for this issue generates new requirements for the DBMS. For instance, the DBMS must let T1 be able to identify the correct index page to lock, and it must let T1 prevent creation of new pages.</p>

 <p>This only demonstrates a tiny piece of the massive complexity of robust and consistent DBMS.</p><a href="http://www.pheedo.com/click.phdo?x=&u=http%3A%2F%2Fwww.computersight.com%2FProgramming%2FThe-Phantom-Problem.39888"><img src="http://www.pheedo.com/img.phdo?x=&u=http%3A%2F%2Fwww.computersight.com%2FProgramming%2FThe-Phantom-Problem.39888" border="0"/></a>]]></description>
<pubDate>Wed, 06 Jun 2007 07:49:56 PST</pubDate></item>
</channel>
</rss>
