DispositionProcessors

  • The list of Processors that make up the DispositionChain to be run

Default configuration

 <!-- now, processors are assembled into ordered DispositionChain bean -->
 <bean id="dispositionProcessors" class="org.archive.modules.DispositionChain">
  <property name="processors">
   <list>
    <!-- write to aggregate archival files... -->
    <ref bean="warcWriter"/>
    <!-- ...send each outlink candidate URI to CandidateChain, 
         and enqueue those ACCEPTed to the frontier... -->
    <ref bean="candidates"/>
    <!-- ...then update stats, shared-structures, frontier decisions -->
    <ref bean="disposition"/>
    <!-- <ref bean="rescheduler" /> -->
   </list>
  </property>
 </bean>
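  • [How processing proceeds]
    1. The crawled data is written to WARC files. -> warcWriter
    2. Each discovered outlink is sent to the CandidateChain, and those that are ACCEPTed are enqueued to the frontier. -> candidates
    3. With one routine finished, stats, shared structures, and frontier decisions are updated. -> disposition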
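
The ordering described above can be sketched in Java as follows. This is only an illustration of "processors run in list order". The `Processor` interface, the `DispositionChainSketch` class, and the log messages are hypothetical and are not Heritrix's actual classes or output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types come from Heritrix.
public class DispositionChainSketch {

    /** One step of the chain; receives the URI and a shared log of what happened. */
    interface Processor {
        void process(String uri, List<String> log);
    }

    /** Runs the processors strictly in list order, mirroring the <list> in the bean. */
    public static List<String> run(String uri) {
        List<Processor> processors = List.of(
            (u, log) -> log.add("warcWriter: archived " + u),
            (u, log) -> log.add("candidates: outlinks sent to CandidateChain"),
            (u, log) -> log.add("disposition: stats and frontier updated")
        );
        List<String> log = new ArrayList<>();
        for (Processor p : processors) {
            p.process(uri, log);
        }
        return log;
    }

    public static void main(String[] args) {
        // Print each step in the order it ran.
        run("http://example.com/").forEach(System.out::println);
    }
}
```

Reordering the entries in the list changes the order in which the steps run, which is the same reason the order of the `<ref bean="..."/>` elements matters in the configuration.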


Posted by Righ