您的当前位置:首页正文

On the Practical Irrelevance of Diverging Implication between Probabilistic Conditional Ind

2022-09-21 来源:图艺博知识网
OnthePracticalIrrelevanceofDivergingImplicationbetweenProbabilisticConditionalIndependenceandEmbeddedMultivaluedDependency

C.J.Butz1andP.Lingras2

1

DepartmentofComputerScience,UniversityofRegina

Regina,Saskatchewan,CanadaS4S0A2

E-mail:butz@cs.uregina.ca

2

DepartmentofMathandComputerScience,SaintMary’sUniversity

Halifax,NovaScotia,Canada,B3H3C3.E-mail:Pawan.Lingras@stmarys.ca

Abstract.Bayesiannetworksserveasthebasisfordevelopingproba-bilisticexpertsystemsandhavebeenappliedwidelyinartificialintel-ligence.PreviousresearchhasarguedthatBayesiannetworksandre-lationaldatabasesaredifferentbyshowingthatthelogicalimplicationofconditionalindependence(CI)andembeddedmultivalueddependency(EMVD)donotalwayscoincide.Inthispaper,weshowthatthistheo-reticaldifferencehasnopracticalimpactwhendesigningprobabilisticex-pertsystems.Therefore,thisworkaddstothemountingevidenceclearlyindicatingthattheimplementationofprobabilisticexpertsystemscantakeadvantageofconventionalrelationaldatabasemanagementsystems.

1Introduction

Bayesiannetworks[3,9,10,12,19,20,26]areanestablishedframeworkforuncer-taintymanagementinartificialintelligence.Forinstance,theyhavebeenappliedinbuildingintelligentagents,suchasOfficeAssistant,andadaptiveuserinter-facesbyMicrosoft[5,7,13],processcontrolbyNASA[6,8]andLockheed[14],softwarediagnosisbyHewlettPackard[2,16]andNokia[11],andmedicaldi-agnosissuchastheHeartDiseaseProgramattheMassachusettsInstituteofTechnology[15]andthePathfinderProjectforlymph-nodediseasesatStanfordUniversity[4].ABayesiannetworkconsistsofadirectedacyclicgraph(DAG)andacorrespondingsetofconditionalprobabilitytables(CPTs).Bytheproba-bilisticconditionalindependencies(CIs)[24]encodedintheDAG,theproductoftheCPTsisajointprobabilitydistribution.Thereby,Bayesiannetworksprovideaclearsemanticmodellingtool,whichfacilitatetheacquisitionofprobabilisticknowledge.

BeforeBayesiannetworkswereproposed,therelationaldatabasemodel[1,18]alreadyestablisheditselfasthebasisfordesigningandimplementingdatabase

systems.Datadependencies1,suchasembeddedmultivalueddependency(EMVD),(nonembedded)multivalueddependency(MVD)andjoindependency(JD),areusedtoprovideaneconomicalrepresentationofauniversalrelation.AsinthestudyofBayesiannetworks,twoofthemostimportantresultsaretheabilitytospecifyauniversalrelationasalosslessjoinofseveralsmallerrelations,andthedevelopmentofefficientmethodstoonlyaccesstherelevantportionsofthedatabaseinqueryprocessing.

Studeny[22]hasarguedthatBayesiannetworksandrelationaldatabasesaredifferentbyshowingthatthelogicalimplicationofCIdoesnotalwayscoin-cidewiththatofEMVD.In[24],werespondedbypointingoutthatStudeny’scounter-exampleisbasedonclassesofdependencieswithoutacompleteaxiom-atization.Inthedesignofprobabilisticexpertsystemsandrelationaldatabases,usuallyonlyclassesofdependencieswithcompleteaxiomatizationsareconsid-ered.Thus,weconcludedin[24]thatStudeny’spointwasmoot.However,onemaystillbelievethattherestillexistsapracticaldifference,ifoneconsiderssituationslikethosegivenin[22].

Inthispaper,ourobjectiveistoshowthatthereisnopracticalconsequenceevenifoneconsidersthetheoreticaldifferencegivenin[22].Weestablishthisresultbyformalizingthetransformationbetweentraditionalrelationsandproba-bilisticrelationswiththenotionofobtainable.Moreover,Lee[17]hasproventhatEMVDisanecessaryconditionforCI.Weusethesetwoconceptstoshowthattheonlyprobabilisticrelations,consideredinpracticewhendesigningaprob-abilisticexpertsystem,mustnecessarilybeobtainedfromtraditionalrelationsagreeingwiththelogicalimplicationofCI.Hence,thisworkisyetanotherex-ample[23–25]emphasizingtheintrinsicrelationshipbetweenBayesiannetworksandrelationaldatabases.

Thispaperisorganizedasfollows.Section2reviewsbackgroundknowledge.Relationshipsbetweentraditonalandprobabilisticrelations,aswellasEMVDandCI,arediscussedinSection3.InSection4,weestablishourmainresult,namely,thetheoreticaldifferencein[22]hasnoconsequenceinpractice.Con-clusionsaremadeinSection5.

2BackgroundKnowledge

Inthissection,wereviewEMVD,CI,andthelogicalimplicationofCIandEMVD.2.1

EmbeddedMultivaluedDependency

HerewegiveaquickintroductiontoEMVD,subclassesofwhichplayanimpor-tantroleinrelationaldatabaseschemadesign[1,18].

ArelationschemeR={A1,A2,...,Am}isafinitesetofattributes(attributenames).CorrespondingtoeachattributeAiisanonemptyfinitesetdom(Ai),1≤i≤m,calledthedomainofAi.LetD=dom(A1)∪dom(A2)...∪dom(Am).ArelationrontherelationschemeR,writtenr(R),isafinitesetofmappings{t1,t2,...,ts}fromRtoDwiththerestrictionthatforeachmappingt∈r,t(Ai)mustbeindom(Ai),1≤i≤m,wheret(Ai)denotesthevalueobtainedbyrestrictingthemappingtoAi.Themappingsarecalledtuplesandt(A)iscalledtheA-valueoft.Weuset(X)intheobviouswayandcallittheX-valueofthetuplet,whereX⊆Risanarbitrarysetofattributes.

Example1.ConsiderfourbinaryattributesA,B,CandD.Onetraditonalre-lationr(ABCD)isshownontheleftofFig.1.

ABCD

r(ABCD)=

001110000101011

001011

000111

Fig.1.Thetraditionalrelationr(ABCD)satisfiestheEMVDB→→A|C,sinceπABC(r)=πAB(r)󰀌󰀍πBC(r).

LetrbearelationonRandXasubsetofR.TheprojectionofrontoX,writtenπX(r),isdefinedas:

πX(r)={t(X)|t∈r}.

(1)

Thenaturaljoinoftworelationsr1(X)andr2(Y),writtenr1(X)󰀡󰀢r2(Y),isdefinedas:

r1(X)󰀡󰀢r2(Y)={t(XY)|t(X)∈r1(X)andt(Y)∈r2(Y)}.

(2)

LetX,Y,ZbedisjointsubsetsofattributesinR.Wesayrelationr(R)satis-fiestheembeddedmultivalueddependency(EMVD)X→→Y|Z,iftheprojectionπXYZ(r)ofr(R)satisfiesthecondition:

πXYZ(r)=πXY(r)󰀡󰀢πXZ(r).

Example2.Thetraditonalrelationr(ABCD)ontheleftofFig.1satisfiestheEMVDB→→A|C,sinceπABC(r)=πAB(r)󰀡󰀢πBC(r).

2.2ProbabilisticConditionalIndependence

Herewebrieflyreviewthenotionofprobabilisticconditionalindependence,sub-classesofwhichplayaninstrumentalroleindesigningtheschemaofaprobabilis-ticnetwork[19].Thereadershouldnotethatweexpressprobabilisticconditionalindependenceintermsofourprobabilisticdatabasemodel[23–25]whichservesasaunifiedapproachforbothBayesiannetworksandrelationaldatabases.

LetR={v1,v2,...,vm}beafinitesetofvariables.Eachvariablevihasafinitedomain,denoteddom(vi),representingthevaluesthatvicantakeon.ForasubsetX={vi,...,vj}ofR,wewritedom(X)fortheCartesianproductofthedomainsoftheindividualvariablesinX,namely,dom(X)=dom(vi)×...×dom(vj).Eachelementx∈dom(X)iscalledaconfigurationofX.

Ajointprobabilitydistribution[20]ondom(R)isafunctionpondom(R)suchthatthefollowingtwoconditionsbothhold:(i)0≤p(t)≤1,foreach󰀁

configurationt∈dom(R),and(ii)t∈dom(R)p(t)=1.0.Apotentialondom(R)isafunctionφondom(R)suchthatthefollowingtwoconditionsbothhold:(i)0≤φ(t),foreachconfigurationt∈dom(R),and(ii)φ(t)>0,foratleastoneconfigurationt∈dom(R).Forbrevity,werefertoφasapotentialonRratherthandom(R),andwecallR,notdom(R),itsdomain[20].

Wenowintroducethefundamentalnotionofprobabilisticconditionalinde-pendency(CI)[24].LetX,YandZbedisjointsubsetsofvariablesinR.Letx,y,andzdenotearbitraryvaluesofX,YandZ,respectively.WesayYandZareconditionallyindependentgivenXunderthejointprobabilitydistributionp,denotedIp(Y,X,Z),if

p(y|x,z)=p(y|x),

(3)

wheneverp(x,z)>0.ThisconditionalindependencyIp(Y,X,Z)canbeequiva-lentlywrittenas

p(y,x,z)=

p(y,x)·p(x,z)

Example3.Oneprobabilisticrelationr(ABCD)isdepictedinthetopleftofFig.2.

ABCDAp(ABCD)

r(ABCD)=

001110000101011

0.20.20.10.10.4

ABC

=

001110000101011

Ap(AB)p(BC)

(0.4)(0.3)/(0.6)(0.4)(0.3)/(0.6)(0.2)(0.3)/(0.6)(0.2)(0.3)/(0.6)(0.4)(0.4)/(0.4)

=0.2=0.2=0.1=0.1=0.4

Fig.2.Theprobabilisticrelationr(ABCD)satisfiestheCIB⇒⇒A|C,sinceτABC(r)=τAB(r)⊗τBC(r).

Theprojectionoperatorπinrelationaldatabasescorrespondstothemarginal-izationoperatorτinprobabilisticdatabases.Whereasπignoresduplicatetuples,τaddsthem.

Example4.Giventheprobabilisticrelationr(ABCD)inthetopleftofFig.2,themarginalizationτABC(r)ofr(ABCD)ontovariablesABCisshowninthetoprightofFig.2.

Probabilisticconditionalindependencyisnowdefinedinourprobabilisticdatabasemodel.Aprobabilisticrelationr(XYZW)satisfiestheCIX⇒⇒Y|Z,if

τXYZ(r)=τXY(r)⊗τXZ(r),

wheretheoperator⊗correspondstotherightsideofEquation(4).

Example5.Theprobabilisticrelationr(ABCD)onthetopleftofFig.2satisfiestheCIB⇒⇒A|C,sincethemarginalτABC(r)canbewrittenasτABC(r)=τAB(r)⊗τBC(r).2.3

LogicalImplicationofCIandEMVD

(5)

Beforewestudytheimplicationproblemindetail,letusfirstintroducesomebasicnotions.Herewewillusethetermsrelationandjointprobabilitydistributioninterchangeably.

󰀁󰀁LetbeasetofdependenciesdefinedonasetofattributesR.BySATR(),wedenotethesetof󰀁allrelations󰀁onRthatsatisfyallofthedependenciesin󰀁

.WewriteSATR()asSAT()whenRisunderstood,andSAT(σ)for󰀁SAT({σ}),where󰀁σisasingledependency.Wesaylogicallyimpliesσ,written󰀁󰀁|=σ,ifSAT()⊆SAT(σ).Inotherwords,σislogicallyimpliedby,󰀁ifeveryrelationwhichsatisfiesalsosatisfiesσ.Thatis,thereisnocounter-󰀁examplerelationsuchthatallofthedependenciesinaresatisfiedbutσisnot.󰀁

Theimplicationproblemistotestwhetheragivensetofdependencieslogicallyimpliesanotherdependencyσ,namely,

󰀂

|=σ.(6)Sinceweadvocatethatourprobabilisticdatabasemodelisageneralization

oftherelationaldatabasemodel,animmediatequestiontoansweris:

Dotheimplicationproblemscoincideinthesetwodatabasemodels?Thatis,wewouldliketoknowwhetherthefollowingpropositionholds:

C|=c⇐⇒C|=c,

(7)

givensetsCandcofCIsandcorrespondingsetsCandcofEMVDs.

Studeny[22]arguedthattheBayesiannetworkcommunitycouldnottakeadvantageofworkintherelationaldatabasecommunitybyshowingthatEqua-tion(7)doesnotalwayshold.Inparticular,hepresentedthefollowingexampleviolatingEquation(7).

Example6.ConsiderthefollowingsetsCandcofCIs,andcorrespondingsetsCandcofEMVDs:

C={A3A4⇒⇒A1|A2,A1⇒⇒A3|A4,A2⇒⇒A3|A4,∅⇒⇒A1|A2},C={A3A4→→A1|A2,A1→→A3|A4,A2→→A3|A4,∅→→A1|A2},c=∅⇒⇒A3|A4,c=∅→→A3|A4.Then

C|=c,

yet

C|=c,

(9)(8)

StudenyprovedEquation(8)in[21].ToshowEquation(9),Studeny[22]presentedtherelationr(A1A2A3A4)inFig.3.Itcanbeverifiedthatrelationr(A1A2A3A4)satisfiesalltheEMVDsinCbutdoesnotsatisfytheEMVDc.

A1A2A3A4

r(A1A2A3A4)=

0001110010110000010100000.100.100.200.200.200.20

Fig.4.ForExample6,aprobabilisticrelationr(R)notsatisfyingC.Thetraditionalrelationr(R)obtainedfromr(R)satisfiesthecorrespondingC,butnotc,ofEMVDs.

Givenatraditionalrelationr(R),aprobabilisticrelationr(R)isobtainablefromr(R),ifthefollowingtwoconditionshold:

(i)r(R)=πR(r),and

(ii)theprobabilityvaluesofr(R)sumtoone.

Example8.Giventhetraditionalrelationr(R)inFig.3,oneobtainableproba-bilisticrelationr(R)isdepictedinFig.4.Onthecontrary,neitheroftheproba-bilisticrelationsinFig.5areobtainablefromr(R)inFig.3,sincecondition(i)isviolatedinbothcases.

11111100.1010.10

Fig.5.Neitherofthesetwoprobabilisticrelationsareobtainablefromthetraditionalrelationr(R)inFig.3.

ThenextresultdiscussesarelationshipbetweenEMVDandCI.

Lemma9.[17]Ifaprobabilisticrelationr(R)satisfiesaCIX⇒⇒Y|Z,thenthetraditionalrelationr(R)obtainablefromr(R)necessarilysatisfiestheEMVDX→→Y|Z.

Lemma9indicatesthatEMVDisanecessaryconditionforCI.

4PracticalIrrelevanceofDivergingImplication

Inthissection,weemphasizetheintrinsicrelationshipbetweenBayesiannet-worksandrelationaldatabasesbyshowingthatthetheoreticaldifferenceof

C|=c

and

C|=c

hasnopracticalconsequence.

Consideranyinstanceofthistheoreticaldifference.Alltraditionalrelationsr(R)onRcanbepartitionedintotwoclasses:(1)thoseagreeingwithC|=c,(2)thosedisagreeingwithC|=c.

Thenextresultshowsthatr(R)doesnotsatisfyC,foranyprobabilisticrelationr(R)obtainablefromanytraditionalrelationr(R)inclass(2).Theorem10.SupposeC|=candC|=cforsomeinstantiationoftheimplica-tionproblem.Foranytraditionalrelationr(R)satisfyingCbutnotc,consideranyprobabilisticrelationr(R)obtainablefromr(R).Thenr(R)∈SATR(C).Proof.Bycontradiction,supposeaprobabilisticrelationr(R)satisfiesalltheCIsinC,wherer(R)wasobtainedfromatraditionalrelationr(R)satisfyingCbutnotc.SinceC|=c,r(R)alsosatisfiesc.ByLemma9,asr(R)wasobtainedfromr(R),thetraditionalrelationr(R)satisfiesbothCandc.Acontradiction.Therefore,allprobabilsticrelationsr(R)satisfyingCmustbeobtainablefromtraditionalrelationsr(R)satisfyingbothCandc.

Theorem10isimportantsinceitindicatesthattheonlyprobabilisticrela-tionsr(R)satisfyingCmustbeobtainedfromthosetraditionalrelationsr(R)inclass(1),i.e.,thoser(R)satisfyingbothCandc.

Example11.Theprobabilisticrelationr(R)illustratedinFig.6satisfiesthegivenCIsCinExample6.Inaddition,itcanbeverifiedthatthetraditionalrelationr(R)obtainedfromr(R)satisfiestheEMVDsCandcinExample6.

A1A2A3A4

r(A1A2A3A4)=

Ap

Example12.Recallthetraditionrelationr(R)inFig.3thatsatisfiesCbutnotc.Oneprobabilisticrelationr(R),obtainablefromr(R),isillustratedinFig.4.Itcanbeverifiedthatr(R)doesnotsatisfyallCIsinC.Bydefinition,

r(R)∈SAT(C).

(11)

Thus,theprobabilisticrelationr(R)inFig.4isnotofinterestwhendesigningaprobabilisticexpertsystemfortheCIsC.

OurgoalistodesigntheschemaofaprobabilisticexpertsystemforagivensetCofCIs.Studeny[22]hassuggestedthatworkondatabasedesignisofnovaluetotheBayesiannetworkcommunity,sinceitmaybethecasethat

C|=c,

whileforthecorrespondingEMVDs

C|=c.

(13)(12)

Thekeypoint,asillustratedinFig.6,isthatwheneveraprobabilisticrelationsatisfiesC,theuniqueobtainabletraditionalrelationr(R)mustsatisfythecor-respondingEMVDsCandc.Onthecontrary,asdepictedinFig.4,r(R)willnotsatisfyC,foranyprobabilisticrelationr(R)obtainablefromatraditionalrelationr(R)satisfyingCbutnotc.Sincer(R)∈SATR(C),itisofnointerestwhendesigningtheprobabilisticexpertsystem.Ourmainresult,Theorem10,ensuresthatthesameresultholdsforallinstancesofEquations(12)and(13).

5Conclusions

InschemadesignofaprobabilisticnetworkonvariablesR,onlyprobabilitydis-tributionsr(R)satisfyingthegivenprobabilisticconditionalindependenciesCareofinterest,thatis,thoser(R)∈SATR(C).Similarly,inschemadesignofarelationaldatabaseonattributesR,onlytraditionalrelationsr(R)satisfyingthegivenEMVDsCareofinterest.Ithasbeenargued[22]thatBayesiannetworksaredifferentfromrelationaldatabasesbyshowinganinstancewhereC|=candC|=c,forcorrespondingsetsofCIsandEMVDs.Onthecontrary,ourmainre-sult(Theorem10)indicatesthatr(R)∈SATR(C),foranyprobabilisticrelationobtainedfromanytraditionalrelationalr(R)satisfyingCbutnotc.Therefore,thefactthatC|=chasnopracticalconsequenceonthedesignofaprobabilisticnetworksatisfyingCIsC.Theworkhereisthenyetonemoreexample[23–25]emphasizingtheintrinsicrelationshipbetweenBayesiannetworksandrelationaldatabases.

References

1.Abiteboul,S.,Hull,R.andVianu,V.:FoundationsofDatabases.Addison-Wesley,DonMills(1995)

2.Bronstein,A.,Das,J.,Duro,M.,Friedrich,R.,Kleyner,G.,Mueller,M.andSing-hal,S.:http://www.hpl.hp.com/techreports/2001/HPL-2001-23R1.pdf,April25,2005

3.Castillo,E.,Guti´errez,J.Hadi,A.:ExpertSystemsandProbabilisticNetworkModels.Springer,NewYork(1997)

4.Heckerman,D.,Horvitz,E.andNathwani,B.:Towardsnormativeexpertsystems:PartIthePathfinderproject.MethodsofInformationinMedicine31(2)(1992)90–105

5.Horvitz,E.:AgentswithBeliefs:ReflectionsonBayesianMethodsforUserMod-elling.In:ProceedingsoftheSixthInternationalConferenceonUserModelling(1997)441–442

6.Horvitz,E.andBarry,E.M.:DisplayofInformationforTimeCriticalDecisionMaking.In:ProceedingsofEleventhConferenceonUncertaintyinArtificialIntel-ligence.MorganKaufmann,SanFrancisco(1995)296-305

7.Horvitz,E.,Breese,J.,Heckerman,D.,Hovel,D.andRommelse,K.:TheLumiereProject:BayesianUserModelingforInferringtheGoalsandNeedsofSoftwareUsers.In:ProceedingsoftheFourteenthConferenceonUncertaintyinArtificialIntelligence.Madison,WI(1998)256-265

8.Horvitz,E.,Srinivas,S.,Rouokangas,C.andBarry,M.:Adecision-theoreticap-proachtothedisplayofinformationfortime-criticaldecisions:TheVistaproject.In:ProceedingsofSOAR-92ConferenceonSpaceOperationsAutomationandResearch,NationalAeronauticsandSpaceAdministration(1992)

9.Jensen,F.V.:AnIntroductiontoBayesianNetworks.UCLPress,London(1996)10.Jensen,F.V.,Lauritzen,S.L.andOlesen,K.G.:Bayesianupdatingincausalproba-bilisticnetworksbylocalcomputations.Comput.Stat.Quarterly4(1990)269–28211.http://www.nokia.com/nokia/0,,53720,00.html,April25,2005

12.Lauritzen,S.L.andSpiegelhalter,D.J.:Localcomputationswithprobabilitieson

graphicalstructuresandtheirapplicationtoexpertsystems.J.Roy.Statist.Soc.B.50(2)(1988)157–24413.Lumi`ereProject:BayesianReasoningforAutomatedAssistance.

http://research.microsoft.com/∼horvitz/lum.htm,April27,2005

14.Lockheed,LockheedMartinAutonomousControlLogictoGuideUnmannedUn-derwaterVehicle,PressRelease,LockheedMartinMissilesandSpaceCommuni-cationsOffice,

http://lmms.external.lmco.com/newsbureau/pressreleases/1996/9604.html,April17,1996

15.Long,W.:Medicaldiagnosisusingaprobabilisticcausalnetwork.AppliedArtificial

Intelligence3(1989)367–383

16.Skaanning,C.,Jensen,F.V.,Kjaerulff,U.,Parker,L.,Pelletier,P.andRostrup-Jensen,L.:http://www.cs.auc.dk/research/DSS/papers/skaanning98a.doc,April25,2005

17.Lee,T.T.:Aninformation-theoreticanalysisofrelationaldatabases-PartI:data

dependenciesandinformationmetric.IEEETransactionsonSoftwareEngineering.SE-13(10)(1987)1049–1061

18.Maier,D.:TheTheoryofRelationalDatabases.PrinciplesofComputerScience.

ComputerSciencePress,Rockville,Maryland(1983)

19.Pearl,J.:ProbabilisticReasoninginIntelligentSystems:NetworksofPlausible

Inference.MorganKaufmannPublishers,SanFrancisco(1988)

20.Shafer,G.:ProbabilisticExpertSystems.SocietyfortheInstituteandApplied

Mathematics,Philadelphia(1996)

21.Studeny,M.:Multiinformationandtheproblemofcharacterizationofconditional-independencerelations.ProblemsofControlandInformationTheory.18(1)(1989)3–16

22.Studeny,M.:Conditionalindependencerelationshavenofinitecompletecharacter-ization.In:ProceedingsoftheEleventhPragueConferenceonInformationTheory,StatisticalDecisionFoundationandRandomProcesses,(1990)377–396

23.Wong,S.K.M.andButz,C.J.:Constructingthedependencystructureofamulti-agentprobabilisticnetwork.IEEETrans.Knowl.DataEng.13(3)(2001)395–41524.Wong,S.K.M.,Butz,C.J.andWu,D.:Ontheimplicationproblemforprobabilistic

conditionalindependency.IEEETrans.Syst.ManCybern.SMC-A30(6)(2000)785–805

25.Wong,S.K.M.,Butz,C.J.andXiang,Y.:Amethodforimplementingaprobabilis-ticmodelasarelationaldatabase.In:ProceedingsoftheEleventhConferenceonUncertaintyinArtificialIntelligence,Montreal,QC(1995)556–564

26.Xiang,Y.:ProbabilisticReasoninginMultiagentSystems:Agraphicalmodels

approach,CambridgeUniversityPress,NewYork(2002)

因篇幅问题不能全部显示,请点此查看更多更全内容

Top