Real-time Upper Body 3D Pose Estimation from a Single Uncalibrated Camera
Antonio S. Micilotta
Eng Jon Ong
Richard Bowden
CVSSP, University of Surrey, Guildford, UK
Abstract
This paper outlines a method of estimating the 3D pose of the upper human body from a single uncalibrated camera. The target application lies in 3D Human Computer Interaction, where hand depth information offers extended functionality when interacting with a 3D virtual environment, but the method is equally suited to animation and motion capture. A database of 3D body configurations is built from a variety of human movements using motion capture data. A hierarchical structure consisting of three subsidiary databases, namely the frontal-view Hand Position (top-level), Silhouette and Edge Map Databases, is pre-extracted from the 3D body configuration database. Using this hierarchy, subsets of the subsidiary databases are then matched to the subject in real-time. The examples of the subsidiary databases that yield the highest matching score are used to extract the corresponding 3D configuration from the motion capture data, thereby estimating the upper body 3D pose.
Categories and Subject Descriptors (according to ACM CCS): I.4.8 [Image Processing and Computer Vision]: Scene Analysis
© The Eurographics Association 2005.
1. Introduction
Human animation can be done laboriously via keyframing, or via motion capture, which can be expensive. The ability to animate directly from video would be a beneficial tool with applications in many areas such as 3D broadcasting, games, HCI and animation.
Statistical methods of reconstructing the 3D pose from a monocular sequence track multiple body points and compute prior probabilities of 3D motions with the aid of training data [BM00, HLF00]. Sidenbladh [Sid01] employed strong motion priors in a particle filter framework to overcome visual ambiguity and presented a tracked walking human in a monocular image sequence. The matching of shape and edge templates has also received attention in hand pose estimation [STTC04], where shape matching follows a cascaded approach to reduce the number of edge template comparisons. We apply a similar method to reconstruct the upper human body, and use hand positions to initially extract corresponding silhouettes.
2. Data acquisition
Using a 3D graphics package, a skeleton is skinned with a generic human mesh (Figure 4(b)) to resemble a person wearing loose fitting clothing. The mesh is assigned an ‘Ink ’n Paint’ material with one level of colour so that the rendered model has a clean ‘cel shaded’ effect. A rendered model with one colour level resembles a simple silhouette, as the outline of the arms is not visible when they move in front of the torso. We therefore colour the respective body parts independently to preserve these edges. The head body part extends from the top of the head to the bottom of the neck, and is comparable to the visible upper body skin tone of the user (from the hairline to the collar of the shirt). The left and right hands are coloured blue and yellow respectively, thereby providing independent labelling. The material from the waist down is transparent, and the rendered model therefore consists of a multi-coloured upper body against a black background (see Figure 1(a)).
A single target camera (a camera whereby the camera-to-target distance remains fixed) is then attached to the chest bone of the skeleton, and is allowed to roll in accordance with it. The skeleton is then animated using a variety of motion capture data to produce a database of 3D body configurations. This sequence, consisting of 5000 frames, is rendered from this camera view, and yields a database of 2D frontal view images (Frontal View Database) of an upright upper body that has a fixed scale, and is centred at position P (Figure 1(a)).
Figure 1: (a) Frontal 2D representation of 3D model, (b) boundary image, (c) edge map.
2.1. Subsidiary datasets
The images of the Frontal View Database are then used to produce a hierarchy of three subsidiary databases. These are computed off-line, and are loaded when the application is executed. From parent down:
1. Hand Position Database. This consists of the 2D positions of the left and right hands, obtained by determining the centroid of the blue and yellow (hand) regions of each frame.
2. Silhouette Database. This is easy to create as the background of each example is black. However, due to the size of the dataset, storing a silhouette image for each frame is unrealistic, as the entire dataset occupies several gigabytes in raw format. It is more efficient to represent each silhouette image in terms of its boundary, as shown in Figure 1(b), stored as entry and exit pairs for each row of the silhouette. This representation not only minimises RAM requirements, but also offers a fast and efficient method of comparison to the input silhouette, which is represented as an integral image (see Section 3.5).
3. Edge Map Database. Conducting an edge detection on the cel shaded and multi-coloured model provides clean edge images (Figure 1(c)). Again, to conserve RAM, only the edge locations are stored.
All examples in these databases are indexed according to the Frontal View Database, and hence the 3D body configuration database that generated it.
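As an illustration of the first two subsidiary representations, the following is a minimal Python sketch, not the authors' implementation. It assumes each rendered frame has already been converted to an integer label image (the label values, and the function names `hand_centroid` and `row_boundary_pairs`, are hypothetical) and that a silhouette is a binary NumPy array. Each foreground run in a row is stored as one entry/exit pair.

```python
import numpy as np

def hand_centroid(label_image, hand_label):
    """Return the (x, y) centroid of pixels equal to hand_label, or None if absent."""
    ys, xs = np.nonzero(label_image == hand_label)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

def row_boundary_pairs(silhouette):
    """Encode a binary silhouette as (y, x_entry, x_exit) triples, one per foreground run."""
    pairs = []
    for y, row in enumerate(silhouette):
        # Pad with zeros so that runs touching the image border are still detected.
        padded = np.concatenate(([0], row.astype(np.int8), [0]))
        changes = np.diff(padded)
        entries = np.nonzero(changes == 1)[0]       # first foreground pixel of each run
        exits = np.nonzero(changes == -1)[0] - 1    # last foreground pixel of each run
        pairs.extend((y, int(x1), int(x2)) for x1, x2 in zip(entries, exits))
    return pairs
```

Storing only the run end points keeps the memory cost proportional to the number of rows rather than the number of pixels, which is the saving the Silhouette Database description relies on.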
3. Model matching
The sections below discuss the processes that occur at run-time, after the subsidiary databases have been loaded.
3.1. Background suppression
In this paper, the input image refers to the image captured from the camera at run-time, and consists of a subject (or user) facing the camera with a cluttered background. Segmenting the user from the input image plays an important role in tracking the various body parts, and in matching a 3D model. A simple solution would be to use a blue screen background where chroma keying can be performed. However, such a controlled environment is limiting, and we therefore make use of a background suppression algorithm that can isolate a user from a cluttered background. Our algorithm was originally developed for exterior visual surveillance and relies upon modelling the colour distribution with a Gaussian mixture model on a per pixel basis. This model is learned in an online fashion using an iterative approximation to expectation maximisation. Once the background has been learned, sudden changes in pixel intensity are associated with foreground movement. Background is represented by ‘0’, and foreground by ‘1’.
3.2. Tracking the user
In order for the entire system to run in real-time, we require a robust method to track the user’s torso, face and hands. Using the segmented image, we make use of a robust tracking algorithm that uses a coarse estimate of body shape to track the torso, and learns a user-specific skin model to track the face and the hands (see Figure 2(a)). The reader is directed to [MB04] for full implementation details.
3.3. Input image adjustment
Referring to an example of the Frontal View Database (Figure 1(a)), the length H from the top of the head to the neckline is constant across all examples, and is used as the reference with which to scale the input image. Position P and length H are pre-computed.
Comparing the Frontal View Database and its subsidiaries to the input image requires that the input image foreground exists in the same spatial domain (see Figure 2(b)). To do this, the input image neck centre IP and head length IH must be determined. The tracking system of Section 3.2 provides the positions and dimensions of the torso and hands. IP is approximated to be the same as the shoulder height, and IH is therefore the length from the top of the head to IP. The scale factor is determined by S = IH/H, and the offset from P to IP is determined by offset = P − IP/S. The input image is scaled and translated in a single pass, creating the adjusted input image (AdjIm) of Figure 2(b):
\[ \forall x, y: \quad \mathrm{AdjIm}(x, y) = \mathrm{inputImage}(x, y)/S + \mathrm{offset} \qquad (1) \]
We then extract an adjusted input silhouette IS and edge map from this adjusted input image.
Figure 2: (a) Input image, (b) adjusted input image, (c) integral image / boundary overlap.
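The scaling and translation can be sketched in Python as follows. This is an illustrative reading of Equation (1), assuming it maps input-image coordinates into the database frame (each coordinate is divided by S and then shifted by the offset). The function names and the (x, y) array layout are assumptions and are not taken from the paper.

```python
import numpy as np

def adjustment_params(P, H, IP, IH):
    """Return the scale factor S = IH / H and offset = P - IP / S."""
    S = IH / H
    offset = np.asarray(P, dtype=float) - np.asarray(IP, dtype=float) / S
    return S, offset

def adjust_points(points, S, offset):
    """Map input-image (x, y) coordinates into the database frame: p / S + offset."""
    return np.asarray(points, dtype=float) / S + offset
```

Under this reading the input neck centre IP lands exactly on P, and a head of length IH is rescaled to the database head length H, which is what the comparison with the database examples requires.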
3.4. Extracting subsidiary database examples
Before conducting silhouette matching, we initially extract a subset of the Silhouette Database by considering the user’s hand positions. Using the left and right hand bounding boxes provided by the tracking algorithm as reference, we search through the Hand Position Database for hand positions that are simultaneously contained by these bounding boxes, and extract the corresponding examples from the Silhouette Database. It is likely that several possible examples will be identified; a matching score is therefore calculated for each example as per Section 3.5.
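The hand-position lookup can be illustrated with a short Python sketch. It assumes the Hand Position Database is held as two NumPy arrays of (x, y) centroids, one row per example, and that bounding boxes are given as inclusive (x_min, y_min, x_max, y_max) tuples. These layouts and the name `select_by_hand_boxes` are hypothetical.

```python
import numpy as np

def select_by_hand_boxes(left_pos, right_pos, left_box, right_box):
    """Return indices of examples whose left AND right hands fall inside the boxes.

    left_pos, right_pos: (N, 2) arrays of (x, y) hand centroids per example.
    left_box, right_box: (x_min, y_min, x_max, y_max), inclusive bounds.
    """
    def inside(pos, box):
        x_min, y_min, x_max, y_max = box
        return ((pos[:, 0] >= x_min) & (pos[:, 0] <= x_max) &
                (pos[:, 1] >= y_min) & (pos[:, 1] <= y_max))

    mask = inside(np.asarray(left_pos), left_box) & inside(np.asarray(right_pos), right_box)
    return np.nonzero(mask)[0]
```

The returned indices can then be used to pull the matching entries from the Silhouette Database, since all subsidiary databases share the Frontal View Database indexing.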
3.5. Silhouette matching using integral images
We determine a set of matching scores for the Silhouette Database subset by computing the percentage pixel overlap between the IS and each example. A crude method would be to reconstruct a silhouette image from the boundary database, and to perform a comparison on a per pixel basis. This is prohibitive as each example silhouette contains approximately 15000 pixels; computing this multiple times would clearly limit real-time performance. The matching procedure is made more efficient by using an intermediate representation of the input silhouette IS, called an integral image II. The II encodes the shape of the object by computing the summation of pixels on a row by row basis. The value of II(x, y) equals the sum of all the non-zero pixels to the left of, and including, IS(x, y):
\[ II(x, y) = \sum_{i=0}^{x} IS(i, y) \qquad (2) \]
The entire II can be computed in this manner for all (x, y); however, for efficiency we compute this incrementally:
\[ \forall x, y: \quad II(x, y) = IS(x, y) + II(x-1, y) \qquad (3) \]
Figure 2(c) offers a visualisation of the II of the IS (extracted from Figure 2(b)), with a silhouette boundary example of the Silhouette Database superimposed. Referring to Figure 2(c), the number of pixels between boundary pair (y, x1) to (y, x2) is computed as NB(y) = x2 − x1 + 1. The number of pixels of the input silhouette for the corresponding range is therefore computed as NIS(y) = II(x2, y) − II(x1, y) + 1. ∑NB and ∑NIS are computed for all boundary pairs, and the matching score is therefore computed as S = ∑NIS/∑NB. This score is computed in a few hundred operations, considerably fewer than the tens of thousands of pixel-pixel comparisons.
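The row-wise integral image and the overlap score can be sketched in Python as below. This is an illustrative sketch under stated assumptions: arrays are indexed [y, x] as is usual in NumPy (the text writes II(x, y)), and boundary pairs are stored as hypothetical (y, x1, x2) triples. For the input pixel count it uses II(x2, y) − II(x1 − 1, y), which coincides with the expression in the text whenever the input pixel at x1 is foreground.

```python
import numpy as np

def row_integral_image(silhouette):
    """Row-wise running sum: II[y, x] counts foreground pixels in row y up to and including x."""
    return np.cumsum(silhouette.astype(np.int32), axis=1)

def overlap_score(integral, boundary_pairs):
    """Fraction of example-silhouette pixels that are foreground in the input silhouette."""
    n_example = 0   # sum of NB over all boundary pairs
    n_input = 0     # sum of NIS over all boundary pairs
    for y, x1, x2 in boundary_pairs:
        n_example += x2 - x1 + 1
        left = int(integral[y, x1 - 1]) if x1 > 0 else 0
        n_input += int(integral[y, x2]) - left
    return n_input / n_example if n_example else 0.0
```

Each boundary pair costs two array lookups and a subtraction, so the work per example scales with the number of silhouette rows rather than the number of silhouette pixels.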
Once matching scores are computed for the examples of the Silhouette Database subset, the top 10% are used to extract a subset of the Edge Map Database.
3.6. Chamfer matching and final selection
Poses with the arms directly in front of the body produce similar silhouettes, and we therefore also consider the edge information to resolve ambiguities. Having extracted a subset of the Edge Map Database, we then compare each of these edge maps to that of the input image to compute a second matching score.
As humans vary in physique, it is unlikely that the edges of the input and the examples will overlap exactly. We therefore apply a distance transform [FH04] to the input edge image (Figure 3(a)) to ‘blur’ the edges (Figure 3(b)). The distance transform specifies the distance of each pixel to the nearest non-zero edge; the darker the pixel, the closer it is to an edge.
Figure 3: (a) Edge image, (b) distance image, (c) Chamfer matching.
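A distance image of this kind, together with the mean-distance score that the Chamfer matching of this section takes from it, can be sketched in Python. The sketch below is not the linear-time algorithm of [FH04]; it swaps in a brute-force Euclidean distance computation that is only practical for small images, chosen to keep the example self-contained. The function names and the (y, x) edge-list layout are assumptions.

```python
import numpy as np

def distance_image(edge_image):
    """Brute-force Euclidean distance from every pixel to the nearest edge pixel."""
    h, w = edge_image.shape
    edge_yx = np.argwhere(edge_image > 0)            # (K, 2) edge coordinates as (y, x)
    if edge_yx.size == 0:
        return np.full((h, w), np.inf)
    grid = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    diff = grid[:, :, None, :] - edge_yx[None, None, :, :]   # shape (h, w, K, 2)
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=-1)

def chamfer_distance(dist_image, example_edges_yx):
    """Mean distance-image value under a non-empty list of example edge pixels (lower is better)."""
    ys, xs = np.asarray(example_edges_yx).T
    return float(dist_image[ys, xs].mean())
```

The distance image is computed once per input frame; scoring each candidate then only requires reading the distance image at that candidate's stored edge locations, and the candidate with the smallest mean distance is kept.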
We then superimpose the example edge map on the distance image, and determine the edge distance: the mean of the distance image pixel values that co-occur with the edge pixels of the example edge map. The example that yields the shortest distance represents the best match, and is used to access the 3D body configuration from the original database. This method of matching edge images is referred to as Chamfer matching [BTBW77].
4. Results
Figure 4(a) shows a tracked subject in various scenes. A representative CG model, corresponding to the best silhouette and edge match, is shown in Figure 4(b). The model illustrated here is that used for the example database and can be easily replaced with another model. The system runs at 16 frames/sec and is invariant to the user’s scale and position.
Figure 4: (a) Frontal pose with (b) corresponding 3D model.
5. Conclusion
We have been successful in matching a corresponding 3D model to a subject. The 3D hand positions can be extracted for HCI, or the CG model itself could be used for animation purposes. Matching by example does, however, require a large example dataset, and we have therefore stored our datasets in their simplest forms. Not only can these simple representations be accessed quickly, but they also contribute to the fast matching methods employed. Furthermore, the hierarchical structure restricts analysis to subsets of the subsidiary databases, thereby contributing to the real-time aspect of the approach.
References
[BM00] BOWDEN R., MITCHELL T.: Non-linear statistical models for the 3d reconstr. of human pose. In Image and Vision Computing (2000), vol. 18, pp. 729–737.
[BTBW77] BARROW H., TENENBAUM J., BOLLES R., WOLF H.: Parametric correspondence and chamfer matching: Two new techniques for image matching. In Proc. of Joint Conf. AI (1977), pp. 659–663.
[FH04] FELZENSZWALB P., HUTTENLOCHER D.: Distance Transforms of Sampled Functions. Tech. Rep. TR2004-1963, Cornell Computing, 2004.
[HLF00] HOWE N., LEVENTON M., FREEMAN W.: Bayesian reconstruction of 3d human motion from single camera video. In NIPS (2000), vol. 12, pp. 820–826.
[MB04] MICILOTTA A., BOWDEN R.: View-based location and tracking of body parts for visual interaction. In Proc. of BMVC (2004), vol. 2, pp. 849–858.
[Sid01] SIDENBLADH H.: Probabilistic Tracking and Reconstruction of 3D Human Motion. PhD thesis, Royal Institute of Technology, CVAPL, Nov 2001.
[STTC04] STENGER B., THAYANANTHAN A., TORR P., CIPOLLA R.: Hand pose estimation using hierarchical detection. In Workshop on HCI (2004), pp. 105–116.