CQMultiple Class Reference

#include <CQMultiple.h>

Inheritance diagram for CQMultiple (diagram not reproduced; it shows CQuery and CMagic).

Public Member Functions

 CQMultiple ()
 ~CQMultiple ()
 CQMultiple (CAccessorAdminCollection &inAccessorAdminCollection, CAlgorithm &inAlgorithm)
virtual CIDRelevanceLevelPairList * fastQuery (const CXMLElement &inQuery, int inNumberOfInterestingImages, double inDifferenceToBest)
virtual CXMLElement * query (const CXMLElement &inQuery)
virtual bool setAlgorithm (CAlgorithm &inAlgorithm)

Static Public Member Functions

static void * doFastQueryThread (void *)
static void * doQueryThread (void *)

Protected Member Functions

void init ()

Protected Attributes

bool mUsesResultURLs


Detailed Description

This is going to be one of our main building blocks. It holds several CQuery objects, hands a query through to each of them, and then merges their results. It is the center of the query tree.

Probably we will put another layer into the class tree: The CQTreeNode, but let's first start.

Important: The basic assumption here is that all children operate on the same collections. If this is not the case, we have to be more careful and, most of all, we have to operate using URLs.

[in the following I am talking about things I WANT to do, so the two modes stuff is not yet implemented]

CQMultiple has two minor modes:

Merge-by-ID or Merge-by-URL

In the first case, we need information on how to translate image IDs to image URLs. We dispatch a fastQuery() to each child node, and then we merge the results (by image ID). The resulting list of ID-relevance-level pairs is translated back to URLs using a URL2FTS accessor.
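The merge-by-ID step can be pictured with a minimal sketch. It assumes each child returns a list of (image ID, relevance) pairs and carries a weight; `ImageID`, `IDScore`, `ChildResult` and `mergeByID` are hypothetical names for illustration, not GIFT types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for GIFT's TID and CIDRelevanceLevelPair.
typedef long ImageID;
typedef std::pair<ImageID, double> IDScore;

// One child's answer plus the weight that child carries in the tree.
struct ChildResult {
  double weight;
  std::vector<IDScore> scores;
};

// Sum weight * relevance per image ID over all children.
std::map<ImageID, double> mergeByID(const std::vector<ChildResult>& children) {
  std::map<ImageID, double> merged;
  for (std::size_t c = 0; c < children.size(); ++c) {
    assert(children[c].weight >= 0.0);  // assumption of this sketch only
    for (std::size_t s = 0; s < children[c].scores.size(); ++s) {
      const IDScore& entry = children[c].scores[s];
      // operator[] creates a missing entry with value 0.0 before adding.
      merged[entry.first] += children[c].weight * entry.second;
    }
  }
  return merged;
}
```

An image returned by several children accumulates the weighted relevance of each; an image returned by only one child keeps that single weighted value.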

Please note that I am aware that this needs refactoring: we should have a URLToID accessor superclass, which provides the necessary translation services without being tied to a given data representation.
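One way such a superclass might look is sketched below. Nothing here exists in GIFT: `URLToIDAccessor`, `InMemoryURLToID` and their member functions are hypothetical, and the (found, value) return pairs merely imitate the pair-returning read style used elsewhere in this class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical interface: translation only, no storage format implied.
class URLToIDAccessor {
public:
  virtual ~URLToIDAccessor() {}
  // first == false means "unknown".
  virtual std::pair<bool, long> URLToID(const std::string& url) const = 0;
  virtual std::pair<bool, std::string> IDToURL(long id) const = 0;
};

// Hypothetical in-memory implementation, for illustration only.
class InMemoryURLToID : public URLToIDAccessor {
  std::map<std::string, long> mURLToID;
  std::map<long, std::string> mIDToURL;
public:
  void add(long id, const std::string& url) {
    assert(!url.empty());
    mURLToID[url] = id;
    mIDToURL[id] = url;
  }
  virtual std::pair<bool, long> URLToID(const std::string& url) const {
    std::map<std::string, long>::const_iterator found = mURLToID.find(url);
    if (found == mURLToID.end()) return std::make_pair(false, 0L);
    return std::make_pair(true, found->second);
  }
  virtual std::pair<bool, std::string> IDToURL(long id) const {
    std::map<long, std::string>::const_iterator found = mIDToURL.find(id);
    if (found == mIDToURL.end()) return std::make_pair(false, std::string());
    return std::make_pair(true, found->second);
  }
};
```

Code that merges results would then depend only on the abstract interface, so a file-backed or database-backed table could be swapped in without touching the merge logic.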

In the second case, we do not need any additional information: we dispatch a query() (as opposed to fastQuery()) to the child nodes, and then we merge the results. This means we have to merge plenty of XML.
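The merge-by-URL step can be sketched in the same spirit. The sketch assumes the XML answers of the children have already been flattened into simple records; `URLHit` and `mergeByURL` are hypothetical names, and the fallback between image and thumbnail location follows what the query() listing on this page does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened form of one result element from a child's XML answer.
struct URLHit {
  std::string imageLocation;      // may be empty if the child sent only a thumbnail
  std::string thumbnailLocation;  // may be empty if the child sent only an image
  double similarity;
};

// Merge hits from all children, keyed by image URL; similarities are summed.
std::map<std::string, URLHit> mergeByURL(const std::vector<URLHit>& hits) {
  std::map<std::string, URLHit> merged;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    URLHit hit = hits[i];
    // Fall back to whichever location is present.
    if (hit.imageLocation.empty()) hit.imageLocation = hit.thumbnailLocation;
    if (hit.thumbnailLocation.empty()) hit.thumbnailLocation = hit.imageLocation;
    if (hit.imageLocation.empty()) continue;  // nothing to key on
    std::map<std::string, URLHit>::iterator found = merged.find(hit.imageLocation);
    if (found == merged.end()) {
      merged.insert(std::make_pair(hit.imageLocation, hit));
    } else {
      found->second.similarity += hit.similarity;
    }
  }
  return merged;
}
```

Keying on the URL string is what makes this mode independent of any ID translation table, at the cost of string comparisons and XML handling.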

Author: Wolfgang Müller

Definition at line 118 of file CQMultiple.h.


Constructor & Destructor Documentation

CQMultiple::CQMultiple (  ) 

default constructor

Definition at line 61 of file CQMultiple.cc.

00061                       {
00062   assert(0);
00063 };

CQMultiple::~CQMultiple (  ) 

Destructor. The accessors used should be unregistered here; at present the body only prints a debug message.

Definition at line 90 of file CQMultiple.cc.

00090                        {
00091 
00092   cout << "destroying this "
00093        << __FILE__
00094        << __LINE__
00095        << flush
00096        << endl;
00097 
00098   //i thought i will need this, but at present I do not have this impression
00099   //it does not hurt, so we leave it in
00100 };

CQMultiple::CQMultiple ( CAccessorAdminCollection &  inAccessorAdminCollection,
CAlgorithm &  inAlgorithm 
)

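Constructor; see CQuery. In addition to what CQuery does, it reads the cui-uses-result-urls attribute of the algorithm into mUsesResultURLs and opens a "url2fts" accessor (ACURL2FTS), which is needed for a proper fastQuery().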

Definition at line 68 of file CQMultiple.cc.

References CXMLElement::boolReadAttribute(), CQuery::mAccessor, CQuery::mAccessorAdmin, mUsesResultURLs, and CAccessorAdmin::openAccessor().

00069                                             :
00070   CQuery(inAccessorAdminCollection,
00071    inAlgorithm){
00072   {
00073 
00074     pair<bool,bool> lUsesURLs(inAlgorithm.boolReadAttribute("cui-uses-result-urls"));
00075 
00076     mUsesResultURLs=(lUsesURLs.first && lUsesURLs.second);
00077 
00078     // mproxy has been filled in a reasonable way 
00079     // by CQuery::CQuery
00080     mAccessor=mAccessorAdmin->openAccessor("url2fts");
00081     assert(mAccessor);
00082   }
00083 };
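The `first && second` test in the constructor treats the attribute as set only if it is present and true. A minimal sketch of that idiom follows; `Attributes`, `readBoolAttribute` and `usesResultURLs` are hypothetical, and the accepted spellings of "true" are an assumption of this sketch, not a statement about CXMLElement::boolReadAttribute().

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an XML element's attribute table.
typedef std::map<std::string, std::string> Attributes;

// Returns (present, value).
// Assumption: "yes" and "true" count as true; the real parser may differ.
std::pair<bool, bool> readBoolAttribute(const Attributes& attrs,
                                        const std::string& name) {
  Attributes::const_iterator found = attrs.find(name);
  if (found == attrs.end()) return std::make_pair(false, false);
  return std::make_pair(true, found->second == "yes" || found->second == "true");
}

// A missing attribute and an attribute set to false are treated alike.
bool usesResultURLs(const Attributes& algorithm) {
  std::pair<bool, bool> flag = readBoolAttribute(algorithm, "cui-uses-result-urls");
  return flag.first && flag.second;
}
```

With this reading, merge-by-ID is the default and merge-by-URL has to be requested explicitly in the algorithm description.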
    


Member Function Documentation

void CQMultiple::init (  )  [protected, virtual]

Do we merge the results by their URL or by their image ID?

We read the algorithm to find out how we will dispatch the queries to the child nodes and in which way we will merge the results coming from them. At present the body is empty.

Implements CQuery.

Definition at line 49 of file CQMultiple.cc.

00049                      {
00050   
00051 
00052   
00053 
00054 };

void * CQMultiple::doFastQueryThread ( void *  inParameter  )  [static]

This function is the inner loop of fastQuery(). If multithreading is available on the system on which GIFT was compiled, it runs in a thread and fastQuery() waits for it.

Definition at line 344 of file CQMultiple.cc.

References CQuery::fastQuery(), CQMThread::mDifferenceToBest, CQMThread::mFastResult, CQMThread::mQuery, CQMThread::mQueryProcessor, CQMThread::mResult, and CQMThread::mResultSize.

Referenced by CQMThread::runFastThread().

00344                                                     {
00345   class CQMThread* lParam((CQMThread*) inParameter);
00346   lParam->mFastResult=lParam->mQueryProcessor.fastQuery(*(lParam->mQuery),
00347                 lParam->mResultSize,
00348                 lParam->mDifferenceToBest);
00349   cout << "I AM FINISHED HERE " << lParam << "result" << lParam->mResult << endl;
00350   return 0;
00351 }
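The shape of this function, a static entry point that takes and returns `void *`, is the one pthread_create expects. A minimal sketch of the pattern, including the fallback of calling the entry point directly when no threads are available, might look as follows; `QueryJob`, `runJob` and `runInline` are hypothetical, and the doubling merely stands in for real query work.

```cpp
#include <cassert>

// Hypothetical parameter block, loosely modelled on what the listing reads
// from CQMThread: inputs go in, the result is written back.
struct QueryJob {
  int input;   // stands in for the query and its parameters
  int result;  // filled in by the entry function
  bool done;
};

// Entry point with the void* (*)(void*) shape that pthread_create expects.
void* runJob(void* inParameter) {
  QueryJob* job = static_cast<QueryJob*>(inParameter);
  assert(job);
  job->result = job->input * 2;  // placeholder for the real query work
  job->done = true;
  return 0;
}

// Without thread support the same function is simply called in place.
void runInline(QueryJob& job) {
  runJob(&job);
}
```

Because the result travels through the parameter block rather than the return value, the caller reads it from the same object after joining the thread or after the inline call returns.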

CIDRelevanceLevelPairList * CQMultiple::fastQuery ( const CXMLElement &  inQuery,
int  inNumberOfInterestingImages,
double  inDifferenceToBest 
) [virtual]

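Calls fastQuery() for every child and merges the results by image ID, multiplying each child's relevance levels by that child's weight. The merged list is sorted so that the best results come first, cut to inNumberOfInterestingImages entries, and normalised by the sum of the weights. The function returns IDs; the caller (see CQuery::query()) assembles the final result from them.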

NEW, MORE EFFICIENT VERSION

Implements CQuery.

Definition at line 369 of file CQMultiple.cc.

References CQMThread::CQMThread(), CQuery::mAccessor, CQuery::mChildren, and CAccessor::size().

00371                                        {
00372 
00373   cout << "CMultiple Number of children:"
00374        << mChildren.size()
00375        << endl;
00376 
00377 
00378   //list<CWeightedResult> lTemporary;
00379   double lWeightSum(0);
00380 
00381   map<TID,CIDRelevanceLevelPair> lResultMap;
00382 
00383 
00384   // no mutex protection needed as this is not called except by main thread
00385 
00386   //
00387   // rip this into two parts
00388   // in order to 
00389   // make it possible to run the querying in one thread
00390   // and do the merging after having waited for each thread
00391   //
00392 
00393   list<CQMThread> lListOfThreads;
00394 
00395   lCChildren::const_iterator lLast=mChildren.end();
00396   lLast--;
00397   for(lCChildren::const_iterator i=mChildren.begin();
00398       i!=mChildren.end();
00399       i++){
00400     lWeightSum+=i->mWeight;
00401     
00402     //lTemporary.push_back(CWeightedResult());
00403     
00404     cout << "çç---------------------this CMultiple:fastQuery" << this 
00405    << ", i->mQuery:" << i->mQuery 
00406    << ", i->mWeight:" << i->mWeight 
00407    << endl;
00408     
00409     
00410     lListOfThreads.push_back(CQMThread(*(i->mQuery),          // The Query processor to choose
00411                inQuery,            // the query to be processed
00412                i->mWeight,         // the weight the result will receive
00413                mAccessor->size(),  // the size of the accessor (to get all potential results)
00414                inDifferenceToBest));// the difference to the best which is allowed for a result
00415     /* EX-LEAK
00416        the following was a special branch for reducing the
00417        number of spawned threads by one. Apparently this did 
00418        not work and caused a memory leak. Now it seems to work.
00419        
00420      if(1==0 
00421      && (i==lLast)){
00422      lListOfThreads.back().callFunction();//something to do for the main thread
00423      }else*/
00424       
00425     {
00426       cout << "Running thread" 
00427      << endl;
00428       lListOfThreads.back().runFastThread();//run the thread
00429       cout << "loop" 
00430      << endl;
00431     }
00432     cout << "endloop" 
00433    << endl;
00434   }
00435   // here we would join all threads
00436 
00437   for(list<CQMThread>::iterator lThread=lListOfThreads.begin();
00438       lThread!=lListOfThreads.end();
00439       lThread++){
00440 
00441     cout << "joining..." << endl;
00442 
00443     lThread->join();
00444 
00445     cout << "before merging " << endl;
00446 
00447     if(!lThread->mFastResult){
00448       cout << "THE THE RESULT OF THIS THREAD WAS NIL " 
00449      << endl;
00450     }
00451     if(lThread->mFastResult){
00452       for(CIDRelevanceLevelPairList::iterator i=lThread->mFastResult->begin();
00453     i!=lThread->mFastResult->end();
00454     i++){
00455   
00456   map<TID,CIDRelevanceLevelPair>::const_iterator lFound=lResultMap.find(i->getID());
00457 
00458   i->setRelevanceLevel(i->getRelevanceLevel()*lThread->getWeight());
00459   
00460   if(lFound==lResultMap.end()){
00461 
00462     lResultMap.insert(make_pair(i->getID(),
00463               *i));
00464   }else{
00465     
00466     lResultMap[i->getID()].setRelevanceLevel(lResultMap[i->getID()].getRelevanceLevel()
00467                +i->getRelevanceLevel()
00468                );
00469   }
00470       }
00471       delete lThread->mFastResult;
00472     }
00473 
00474     
00475     cout << "after merging " << endl;
00476   }
00477   
00478   CIDRelevanceLevelPairList* lReturnValue=new CIDRelevanceLevelPairList();
00479 
00480   cout << "<pushing>"
00481        << endl;
00482 
00483   for(map<TID,CIDRelevanceLevelPair>::const_iterator i=lResultMap.begin();
00484       i!=lResultMap.end();
00485       i++){
00486     lReturnValue->push_back(i->second);
00487   }
00488 
00489   cout << "</pushing>\n<sorting>"
00490        << endl;
00491 
00492   lReturnValue->sort();
00493   lReturnValue->reverse();
00494   cout << "Size of the result "
00495        << lReturnValue->size()
00496        << endl;
00497   cout << "</sorting>"
00498        << endl;
00499 
00500   cout << "<cutting>"
00501        << endl;
00502   {
00503     CIDRelevanceLevelPairList::iterator iSkip=lReturnValue->begin();
00504     for(int i=0;
00505   i<inNumberOfInterestingImages && i<lReturnValue->size();
00506   i++){
00507       iSkip->setRelevanceLevel(iSkip->getRelevanceLevel()/lWeightSum);
00508       iSkip++;
00509     }
00510     lReturnValue->erase(iSkip,lReturnValue->end());
00511   }
00512   cout << "</cutting>"
00513        << endl;
00514   return lReturnValue;
00515 };
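The final sort, cut and normalise steps of the listing can be condensed into a short sketch. `topNormalised` is a hypothetical helper working on plain scores; it asserts a positive weight sum, a check the listing itself does not make.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Keep the best `keep` scores in descending order, each divided by the
// sum of the child weights so that the result is comparable across trees.
std::vector<double> topNormalised(std::vector<double> scores,
                                  std::size_t keep,
                                  double weightSum) {
  assert(weightSum > 0.0);  // assumption: at least one child with positive weight
  std::sort(scores.begin(), scores.end(), std::greater<double>());
  if (scores.size() > keep) scores.resize(keep);  // drop everything past the cut
  for (std::size_t i = 0; i < scores.size(); ++i) scores[i] /= weightSum;
  return scores;
}
```

Normalising only after the cut, as the listing does, avoids dividing entries that are about to be discarded.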

void * CQMultiple::doQueryThread ( void *  inParameter  )  [static]

This function is the inner loop of query(). If multithreading is available on the system on which GIFT was compiled, it runs in a thread and query() waits for it.

It does the same as doFastQueryThread(), but starts query() instead of fastQuery().

Definition at line 355 of file CQMultiple.cc.

References CQMThread::mQuery, CQMThread::mQueryProcessor, CQMThread::mResult, and CQuery::query().

Referenced by CQMThread::runThread().

00355                                                 {
00356   class CQMThread* lParam((CQMThread*) inParameter);
00357   lParam->mResult=lParam->mQueryProcessor.query(*(lParam->mQuery));
00358   cout << "I AM FINISHED HERE " << lParam << "result" << lParam->mResult << endl;
00359   return 0;
00360 }

CXMLElement * CQMultiple::query ( const CXMLElement &  inQuery  )  [virtual]

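Calls query() for every child and merges the results by URL.

If mUsesResultURLs is not set, it falls back to CQuery::query(), which calls fastQuery() and assembles the result from that. Otherwise it asks each child for five times the requested number of results, merges the returned XML by image location, and builds a new query-result element from the best entries.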

NEW, MORE EFFICIENT VERSION

Reimplemented from CQuery.

Definition at line 592 of file CQMultiple.cc.

References CXMLElement::addAttribute(), CXMLElement::addChild(), mrml_const::calculated_similarity, CXMLElement::child_list_begin(), CXMLElement::child_list_end(), CXMLElement::clone(), mrml_const::error, CQuery::getRandomImages(), mrml_const::image_location, CXMLElement::longReadAttribute(), CQuery::mAccessor, CQuery::mChildren, mrml_const::message, CXMLElement::moveUp(), mUsesResultURLs, CQuery::query(), mrml_const::query_result, mrml_const::query_result_element, mrml_const::query_result_element_list, mrml_const::result_cutoff, mrml_const::result_size, CAccessor::size(), and mrml_const::thumbnail_location.

00592                                                         {
00593 
00594   if(!mUsesResultURLs){
00595     // if mUsesResultURLs is not set,
00596     // just call fastquery, and assemble from that a result,
00597     // as CQuery does.
00598     return CQuery::query(inQuery);
00599   }
00600   
00601   pair<bool,long> lNumberOfInterestingImages=
00602     inQuery.longReadAttribute(mrml_const::result_size);
00603   
00604   int inNumberOfInterestingImages=
00605     lNumberOfInterestingImages.second;
00606 
00607   pair<bool,long> lCutoff=
00608     inQuery.longReadAttribute(mrml_const::result_cutoff);
00609   
00610   int inCutoff=
00611     lCutoff.second;
00612 
00613   // do a deep clone of the query (for const cast)
00614   CXMLElement* lQuery=inQuery.clone(1);
00615 
00616   if(lQuery->child_list_begin()!=lQuery->child_list_end()){
00617 
00618     //
00619     // set the result size to a multiple of the 
00620     // number of the images requested 
00621     // to get a higher probability that
00622     // the combination reflects the real score
00623     // you want more explanation?
00624     // you get it at help-gift@gnu.org
00625     //
00626     lQuery->addAttribute(mrml_const::result_size,
00627        long(inNumberOfInterestingImages*5));
00628     
00629     cout << "CMultiple::query Number of children:"
00630    << mChildren.size()
00631    << endl;
00632     
00633     
00634     //list<CWeightedResult> lTemporary;
00635     double lWeightSum(0);
00636     
00637     map<string,CMergeTriplet> lResultMap;
00638     
00639     // no mutex protection needed as this is not called except by main thread
00640     
00641     //
00642     // rip this into two parts
00643     // in order to 
00644     // make it possible to run the querying in one thread
00645     // and do the merging after having waited for each thread
00646     //
00647     
00648     list<CQMThread> lListOfThreads;
00649 
00650     lCChildren::const_iterator lLast=mChildren.end();
00651     lLast--;
00652     for(lCChildren::const_iterator i=mChildren.begin();
00653   i!=mChildren.end();
00654   i++){
00655       lWeightSum+=i->mWeight;
00656     
00657       //lTemporary.push_back(CWeightedResult());
00658     
00659       cout << "**-------------------------------this CMultiple QUERY:" << this 
00660      << ", i->mQuery:" << i->mQuery 
00661      << ", i->mWeight:" << i->mWeight 
00662      << endl;
00663     
00664     
00665       lListOfThreads.push_back(CQMThread(*(i->mQuery),          // The Query processor to choose
00666            *lQuery,            // the query to be processed
00667            i->mWeight,         // the weight the result will receive
00668            mAccessor->size(),  // the size of the accessor (to get all potential results)
00669            inCutoff));// the difference to the best which is allowed for a result
00670       /* EX-LEAK
00671    the following was a special branch for reducing the
00672    number of spawned threads by one. Apparently this did 
00673    not work and caused a memory leak. Now it seems to work.
00674        
00675    if(1==0 
00676    && (i==lLast)){
00677    lListOfThreads.back().callFunction();//something to do for the main thread
00678    }else*/
00679       
00680       {
00681   cout << "Running thread" 
00682        << endl;
00683   lListOfThreads.back().runThread();//run the thread
00684   cout << "loop" 
00685        << endl;
00686       }
00687       cout << "endloop" 
00688      << endl;
00689     }
00690     // here we would join all threads
00691 
00692     for(list<CQMThread>::iterator lThread=lListOfThreads.begin();
00693   lThread!=lListOfThreads.end();
00694   lThread++){
00695 
00696       cout << "joining..." << endl;
00697 
00698       lThread->join();
00699 
00700       cout << "before merging " << endl;
00701 
00702       if(!lThread->mResult){
00703   cout << "THE THE RESULT OF THIS THREAD WAS NIL " 
00704        << endl;
00705       }
00706       if(lThread->mResult){
00707   /*
00708     OK. At this point we got back a query-result XML element.
00709     now we want the result-element-list
00710   */
00711   cout << "H" << flush;
00712   for(CXMLElement::lCChildren::const_iterator i=lThread->mResult->child_list_begin();
00713       i!=lThread->mResult->child_list_end();
00714       i++){
00715     cout << "I" << flush;
00716     if((*i)->getName() == mrml_const::query_result_element_list){
00717       for(CXMLElement::lCChildren::const_iterator j=(*i)->child_list_begin();
00718     j!=(*i)->child_list_end();
00719     j++){
00720         cout << "J" << flush;
00721         if((*j)->getName() == mrml_const::query_result_element){
00722     cout << "K" << flush;
00723     /* 
00724        inside this, *j points now to an XML element which
00725        is a query-result-element. from this we will read now
00726        the calculated-relevance,
00727        as well as the image and thumbnail location.
00728     */
00729     pair<bool,double> lCalculatedRelevance=
00730       (*j)->doubleReadAttribute(mrml_const::calculated_similarity);
00731     pair<bool,string> lImageLocation=
00732       (*j)->stringReadAttribute(mrml_const::image_location);
00733     pair<bool,string> lThumbnailLocation=
00734       (*j)->stringReadAttribute(mrml_const::thumbnail_location);
00735 
00736     cout << "L" << flush;
00737 
00738     // no relevance corresponds to relevance 0
00739     if(! lCalculatedRelevance.first){
00740       lCalculatedRelevance=make_pair(bool(0),
00741              double(0));
00742     }
00743 
00744     cout << "L" << flush;
00745     // if there is a thumbnail and no image,
00746     // we take the thumbnail location as image location
00747     if((lThumbnailLocation.first)
00748        && (!lImageLocation.first)){
00749       lImageLocation=lThumbnailLocation;
00750     }
00751     cout << "L" << flush;
00752     // if there is an image and no thumbnail,
00753     // we take the image location as thumbnail location
00754     if((!lThumbnailLocation.first)
00755        && (lImageLocation.first)){
00756       lThumbnailLocation=lImageLocation;
00757     }
00758     cout << "L" << flush;
00759         
00760     // now we are guaranteed to have a well-initialised
00761     // image location
00762     if(lImageLocation.first){
00763       map<string,CMergeTriplet>::iterator lFound(lResultMap.find(lImageLocation.second));
00764       
00765       if(lFound!=lResultMap.end()){
00766         cout << "A" << flush;
00767         lFound->second.addToRelevance(lCalculatedRelevance.second);
00768         cout << "[" << lFound->second.getCalculatedSimilarity() << "]" << flush;
00769       }else{
00770         // this result is not yet in the result map
00771         lFound=lResultMap.insert(make_pair(lImageLocation.second,CMergeTriplet(lImageLocation.second,
00772                         lThumbnailLocation.second))).first;
00773         lFound->second.addToRelevance(lCalculatedRelevance.second);
00774       }
00775     cout << "M" << flush;
00776 
00777     }
00778         }
00779       }
00780     }
00781   }
00782   delete lThread->mFastResult;
00783       }
00784 
00785     
00786     
00787       cout << "after thread " << endl;
00788     }
00789     cout << "ALL THREADS FINISHED " << endl;
00790   
00791     {
00792       // now we build a list of merge triplets
00793       // that is sorted by score in descending order
00794       list<CMergeTriplet> lResultList;
00795       for(map<string,CMergeTriplet>::const_iterator i=lResultMap.begin();
00796     i!=lResultMap.end();
00797     i++){
00798   lResultList.push_back(i->second);
00799       }
00800       cout << "DELETING " << endl;
00801       lResultList.sort(CSortDescendingByRelevance_MT());
00802       cout << "DELETING " << endl;
00803     
00804       {
00805   list<CMergeTriplet>::iterator iSkip=lResultList.begin();
00806   for(int i=0;
00807       i<inNumberOfInterestingImages && i<lResultList.size();
00808       i++){
00809     iSkip->setSimilarity(iSkip->getCalculatedSimilarity()/lWeightSum);
00810     iSkip++;
00811   }
00812   lResultList.erase(iSkip,lResultList.end());
00813       }
00814 
00815       // now let's build a result element tree
00816       CXMLElement* lReturnValue(new CXMLElement(mrml_const::query_result,0));
00817       CXMLElement* lReturnList(new CXMLElement(mrml_const::query_result_element_list,0));
00818       lReturnValue->addChild(lReturnList);
00819     
00820       assert(mAccessor);
00821     
00822       for(list<CMergeTriplet>::const_iterator i=lResultList.begin();
00823     i!=lResultList.end();
00824     i++){
00825       
00826   CXMLElement* lReturnElement(new CXMLElement(mrml_const::query_result_element,
00827                 0));
00828   {
00829     double lRelevanceLevel(i->getCalculatedSimilarity());
00830     string lString(mrml_const::calculated_similarity);
00831     lReturnElement->addAttribute(lString,
00832                lRelevanceLevel);
00833   }
00834       
00835       
00836       
00837   {
00838     string lURL(i->getImageLocation());
00839     string lString(mrml_const::image_location);
00840     lReturnElement->addAttribute(lString,
00841                lURL);
00842   }
00843 
00844   {
00845     string lURL(i->getThumbnailLocation());
00846     string lString(mrml_const::thumbnail_location);
00847     lReturnElement->addAttribute(lString,
00848                lURL);
00849   }
00850       
00851   lReturnValue->addChild(lReturnElement);
00852       
00853   lReturnValue->moveUp();
00854       
00855       }
00856       //gMutex->unlock();//debugging
00857       return lReturnValue;
00858     }
00859   }else{
00860     //gMutex->unlock();//debugging
00861     return getRandomImages(inNumberOfInterestingImages);
00862   }
00863 
00864   //gMutex->unlock();//debugging
00865   // missing sort and output
00866   list<pair<string,string> > lAttributes;
00867   lAttributes.push_back(make_pair(mrml_const::message,
00868           string("empty query result, i seem to have missed all ifs and elses!")));
00869   return new CXMLElement(mrml_const::error,lAttributes);
00870 };

bool CQMultiple::setAlgorithm ( CAlgorithm &  inAlgorithm  )  [virtual]

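Sets the algorithm, following the same scheme as setCollection: if the collection ID is unchanged, nothing happens; otherwise the old "url2fts" accessor is closed and a new one is opened for the new collection.

Everything else happens in the children.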

Reimplemented from CQuery.

Definition at line 113 of file CQMultiple.cc.

References CAccessorAdmin::closeAccessor(), CAlgorithm::getCollectionID(), CAccessorAdminCollection::getProxy(), CQuery::mAccessor, CQuery::mAccessorAdmin, CQuery::mAccessorAdminCollection, CQuery::mAlgorithm, CAccessorAdmin::openAccessor(), and CQuery::setAlgorithm().

00113                                                      {
00114   if(mAlgorithm && mAlgorithm->getCollectionID()==inAlgorithm.getCollectionID()){
00115     
00116     return true;
00117     
00118   }else{
00119     //close the old collection, if existing
00120     if(mAccessorAdmin)
00121       mAccessorAdmin->closeAccessor("url2fts");
00122     //
00123     mAccessorAdmin=&mAccessorAdminCollection->getProxy(inAlgorithm.getCollectionID());
00124     mAccessor=mAccessorAdmin->openAccessor("url2fts");
00125 
00126     assert(mAccessor);
00127     //
00128     return (CQuery::setAlgorithm(inAlgorithm) && mAccessor);
00129   }
00130 };
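The early return on an unchanged collection ID is a common way to avoid reopening a resource needlessly. A minimal sketch of that idea, with a hypothetical `AccessorHolder` class and a counter in place of real accessor handling:

```cpp
#include <cassert>
#include <string>

// Hypothetical holder that reopens its accessor only when the collection changes.
class AccessorHolder {
  std::string mCollectionID;
  int mOpenCount;  // counts how often the (pretend) accessor was opened
public:
  AccessorHolder() : mOpenCount(0) {}

  bool setCollection(const std::string& id) {
    assert(!id.empty());
    // Same collection as before: keep the accessor that is already open.
    if (mOpenCount > 0 && id == mCollectionID) return true;
    mCollectionID = id;
    ++mOpenCount;  // stands in for closing the old accessor and opening a new one
    return true;
  }

  int openCount() const { return mOpenCount; }
};
```

Calling `setCollection` repeatedly with the same ID opens the accessor once; only a different ID triggers a reopen.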


Member Data Documentation

bool CQMultiple::mUsesResultURLs [protected]

do we merge result URLs or result IDs?

Definition at line 129 of file CQMultiple.h.

Referenced by CQMultiple(), and query().


The documentation for this class was generated from the following files:

Generated on Wed Jan 7 00:31:13 2009 for Gift by  doxygen 1.5.6