[Heritrix/crawler-beans.cxml]CANDIDATE CHAIN-candidateProcessors
2016. 7. 27. 18:01
CandidateProcessors
- Lists the processors that make up the CandidateChain.
Default configuration
<!-- now, processors are assembled into ordered CandidateChain bean -->
<bean id="candidateProcessors" class="org.archive.modules.CandidateChain">
  <property name="processors">
    <list>
      <!-- apply scoping rules to each individual candidate URI... -->
      <ref bean="candidateScoper"/>
      <!-- ...then prepare those ACCEPTed to be enqueued to frontier. -->
      <ref bean="preparer"/>
    </list>
  </property>
</bean>
- [How processing proceeds]
- Scoping rules are applied to each candidate URI to filter it. -> candidateScoper
- The URIs that pass the filter (ACCEPTed) are then prepared to be enqueued to the frontier. -> preparer
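The two `<ref>` entries in the chain only name other beans, so the chain works only if beans with those ids are declared elsewhere in crawler-beans.cxml. A minimal sketch of what those declarations might look like is below. The class names are an assumption taken from the stock Heritrix 3 profile, not from this post, so check them against your own file.

```xml
<!-- Assumed declarations of the beans referenced by candidateProcessors.
     Class names follow the stock Heritrix 3 profile and may differ in
     other versions. -->

<!-- First step in the chain: applies the scope rules to each candidate URI -->
<bean id="candidateScoper" class="org.archive.crawler.prefetch.CandidateScoper">
</bean>

<!-- Second step in the chain: prepares ACCEPTed URIs for enqueueing to the frontier -->
<bean id="preparer" class="org.archive.crawler.prefetch.FrontierPreparer">
</bean>
```

The order of the `<ref>` elements inside `<list>` is the order in which the processors run, so candidateScoper has to come before preparer.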