java.lang.Object
fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Semaphore
fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Mutex
fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer
fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe
fr.gouv.culture.oai.SynchronizedOAIObjectImpl
fr.gouv.culture.oai.AbstractOAIHarvester
public abstract class AbstractOAIHarvester
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface fr.gouv.culture.oai.OAIObject |
---|
OAIObject.Node |
Field Summary | |
---|---|
protected java.lang.String[] |
adminEmails
List of email address strings for administrators of this harvester |
protected boolean |
captureElemContent
flag for sax event handling indicating that an element's content should be captured in the endElement method |
protected boolean |
captureRecord
flag for sax event handling indicating that a record should be captured |
protected java.lang.String |
currentDatestamp
The _datestamp for the current record from the stream |
protected java.lang.String |
currentMetadtaUrlIdentifier
Variable to hold any value retrieved based on identifierName |
protected java.lang.String |
currentOaiIdentifier
The oai identifier for the current record from the stream |
protected java.lang.String |
currentOaiStatus
The oai status for the current record from the stream |
protected int |
cursor
Variable to hold cursor information from a request using resumptionTokens to return an entire set by multiple parts |
protected boolean |
deleteRecord
flag for sax event handling indicating that a record should be deleted |
protected java.lang.String |
errorCode
|
protected org.apache.cocoon.xml.XMLConsumer |
firstXmlConsumer
The first externally provided xml consumer. |
protected java.lang.String |
identifierName
If an identifier name is provided, the harvester attempts to take the value of the element with that name, outside of the OAI 2.0 namespace, and, assuming the value is a valid URL identifier, retrieves the underlying XML document and incorporates its content into the OAI record |
protected org.apache.avalon.framework.service.ServiceManager |
manager
Service manager for the object |
protected java.lang.String |
newRequestUrl
The new URL to resolve the next resumptionToken |
static java.lang.String |
OAI_REPOSITORY_URL
|
static java.lang.String |
OAI_REQUEST_URL
|
protected java.lang.String |
repoUrl
Variable to hold the url of the repository from which a response is being received |
protected org.apache.avalon.framework.parameters.Parameters |
requestParams
The parameters for the request sent for which a response is being received |
protected java.lang.String |
requestUrl
Variable to hold the url of the request to be sent |
protected java.lang.String |
responseDate
Variable to hold the _datestamp of the response of the repository from which a response is being received |
protected java.lang.String |
resumptionToken
Variable to hold the resumptionToken of the response of the repository from which a response is being received |
protected java.lang.StringBuffer |
sBuff
buffer for data collection from sax stream |
protected java.lang.String |
userAgent
User agent value to send with request |
Fields inherited from class fr.gouv.culture.oai.SynchronizedOAIObjectImpl |
---|
_context, logger |
Fields inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer |
---|
synchronizedXmlConsumer |
Fields inherited from interface fr.gouv.culture.oai.OAIObject |
---|
HTTP_HEADER_NAME_FROM, HTTP_HEADER_NAME_USER_AGENT, NUMBER_RECORDS_PER_RESPONSE, STRING_DATEFORMAT_GRANULARITY_DAY, STRING_DATEFORMAT_GRANULARITY_SECOND |
Fields inherited from interface EDU.oswego.cs.dl.util.concurrent.Sync |
---|
ONE_CENTURY, ONE_DAY, ONE_HOUR, ONE_MINUTE, ONE_SECOND, ONE_WEEK, ONE_YEAR |
Constructor Summary | |
---|---|
AbstractOAIHarvester()
|
Method Summary | |
---|---|
protected void |
abortRecordCapture()
Stops any record capture currently being executed and sends a flag to the called method telling it to delete any document saved to any media |
protected abstract void |
captureRecord()
When a complete record is received, this method takes the necessary steps to save the record to any underlying media, or pre-media |
protected abstract void |
captureResourceFromUrlIdentifier()
When a complete "underlying document" is received, this method takes the necessary steps to save the document to any underlying media, or pre-media |
void |
characters(char[] chars,
int relation,
int relation1)
Receive notification of character data. |
void |
close()
Close OAI harvester. |
void |
endElement(java.lang.String s,
java.lang.String s1,
java.lang.String s2)
Receive notification of the end of an element. |
java.lang.String[] |
getAdminEmails()
Retrieves the list of administrator email addresses |
protected org.apache.avalon.framework.parameters.Parameters |
getHarvestParameters()
This method returns the parameters for the request sent by this harvester as well as the "repository url", "request url", and the "harvester admin email". |
protected void |
handleErrors(java.lang.String errorMsg)
Logs error messages and the request parameters that were sent to the repository, which may have caused the error state |
protected abstract void |
handleResumptionToken()
This method handles and reissues a new request using any resumption token received |
protected abstract void |
prepareRecordCapture()
Prepares resources for capturing an oai record |
protected abstract void |
prepareRecordForDeletion()
After receiving a header@status="deleted" for a record, this method makes the necessary preparations to delete the record from the harvester |
protected abstract void |
prepareResourceFromUrlIdentifierCapture()
Prepares resources for capturing the underlying document available via a url described by the oai record |
void |
receiveRequest(java.lang.String url)
Internal receive request method that bypasses synchronization of this object, as it may have already been synchronized elsewhere in the processing. |
void |
receiveSynchronizedRequest(java.lang.String url)
Receive an OAI request |
void |
receiveSynchronizedRequest(java.lang.String url,
java.lang.String originalRequestUrl)
Receive an OAI request as a URL. |
void |
recycle()
Clears any consumers provided to this object |
protected void |
resetAllFields()
Resets the necessary class fields |
protected abstract void |
resetRecordCaptureFields(boolean deleteDoc)
Stops any record capture currently being executed, resets the corresponding class fields and potentially deletes any document saved to any media |
protected void |
resetResumptionToken()
|
protected abstract void |
saveCriticalFields(boolean dataHarvested)
If data has been harvested, this method saves any/all details of the harvest |
void |
service(org.apache.avalon.framework.service.ServiceManager serviceManager)
The service manager for the object |
void |
setAdminEmails(java.lang.String[] adminEmails)
Establishes the list of administrator email addresses |
void |
setConsumer(org.apache.cocoon.xml.XMLConsumer consumer)
Sets the consumer of this object's events and will attempt to establish our firstXmlConsumer |
void |
setIdentifierName(java.lang.String name)
Establishes the identifier class field |
protected abstract boolean |
shouldHarvestDocument()
Queries underlying data structures to determine whether the current oai record should be harvested based on the state of the harvester (i.e. past harvests, presence or absence of the record in harvester data structures) |
void |
startElement(java.lang.String s,
java.lang.String s1,
java.lang.String s2,
org.xml.sax.Attributes attributes)
Receive notification of the beginning of an element. |
protected abstract void |
storeFailedHarvestData(java.lang.Exception e)
This method stores information about a failed harvest request (an internal failure, not an external error from the OAI repository), so that the valid request may be re-executed by the proper mechanism. |
protected abstract boolean |
storeHarvestedData()
This method saves all harvested records to a particular media |
void |
toSAX(org.xml.sax.ContentHandler contentHandler)
Currently does nothing |
Methods inherited from class fr.gouv.culture.oai.SynchronizedOAIObjectImpl |
---|
contextualize, enableLogging, getContext, sendElement, sendElementContent |
Methods inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLPipe |
---|
acquireSynchronizedXMLConsumer, comment, endCDATA, endDocument, endDTD, endEntity, endPrefixMapping, ignorableWhitespace, processingInstruction, releaseSynchronizedXMLConsumer, setDocumentLocator, skippedEntity, startCDATA, startDocument, startDTD, startEntity, startPrefixMapping |
Methods inherited from class fr.gouv.culture.util.apache.cocoon.xml.AbstractSynchronizedXMLProducer |
---|
setConsumer |
Methods inherited from class fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Mutex |
---|
acquired, isAcquired |
Methods inherited from class fr.gouv.culture.util.apache.avalon.excalibur.concurrent.Semaphore |
---|
acquire, attempt, getTokens, release |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface fr.gouv.culture.oai.OAIHarvester |
---|
purgePastHarvestsData, sendPastHarvestsSummary, sendStoredHarvestingRequests |
Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled |
---|
enableLogging |
Methods inherited from interface org.apache.avalon.framework.context.Contextualizable |
---|
contextualize |
Methods inherited from interface org.xml.sax.ContentHandler |
---|
endDocument, endPrefixMapping, ignorableWhitespace, processingInstruction, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping |
Methods inherited from interface org.xml.sax.ext.LexicalHandler |
---|
comment, endCDATA, endDTD, endEntity, startCDATA, startDTD, startEntity |
Methods inherited from interface org.apache.avalon.framework.configuration.Configurable |
---|
configure |
Methods inherited from interface fr.gouv.culture.util.apache.cocoon.xml.SynchronizedXMLProducer |
---|
acquired, setConsumer |
Methods inherited from interface fr.gouv.culture.util.apache.cocoon.xml.SynchronizedXMLConsumer |
---|
acquired |
Methods inherited from interface EDU.oswego.cs.dl.util.concurrent.Sync |
---|
acquire, attempt, release |
Field Detail |
---|
public static final java.lang.String OAI_REQUEST_URL
public static final java.lang.String OAI_REPOSITORY_URL
protected org.apache.avalon.framework.service.ServiceManager manager
protected java.lang.String[] adminEmails
protected java.lang.String userAgent
protected java.lang.String requestUrl
protected java.lang.String newRequestUrl
protected org.apache.avalon.framework.parameters.Parameters requestParams
protected java.lang.StringBuffer sBuff
protected boolean captureElemContent
protected boolean captureRecord
protected boolean deleteRecord
protected java.lang.String repoUrl
protected java.lang.String responseDate
protected java.lang.String resumptionToken
protected int cursor
protected java.lang.String errorCode
protected java.lang.String currentOaiIdentifier
protected java.lang.String currentDatestamp
protected java.lang.String currentOaiStatus
protected java.lang.String identifierName
protected java.lang.String currentMetadtaUrlIdentifier
identifierName
protected org.apache.cocoon.xml.XMLConsumer firstXmlConsumer
Constructor Detail |
---|
public AbstractOAIHarvester()
Method Detail |
---|
public void service(org.apache.avalon.framework.service.ServiceManager serviceManager) throws org.apache.avalon.framework.service.ServiceException
service
in interface org.apache.avalon.framework.service.Serviceable
serviceManager
-
org.apache.avalon.framework.service.ServiceException
public void setConsumer(org.apache.cocoon.xml.XMLConsumer consumer)
firstXmlConsumer
setConsumer
in interface org.apache.cocoon.xml.XMLProducer
setConsumer
in class AbstractSynchronizedXMLProducer
consumer
-
public java.lang.String[] getAdminEmails()
getAdminEmails
in interface OAIHarvester
public void setAdminEmails(java.lang.String[] adminEmails)
setAdminEmails
in interface OAIHarvester
public void setIdentifierName(java.lang.String name)
setIdentifierName
in interface OAIHarvester
name
- identifierName
public void toSAX(org.xml.sax.ContentHandler contentHandler) throws org.xml.sax.SAXException
toSAX
in interface org.apache.excalibur.xml.sax.XMLizable
contentHandler
-
org.xml.sax.SAXException
public void startElement(java.lang.String s, java.lang.String s1, java.lang.String s2, org.xml.sax.Attributes attributes) throws org.xml.sax.SAXException
AbstractSynchronizedXMLPipe
startElement
in interface org.xml.sax.ContentHandler
startElement
in class SynchronizedOAIObjectImpl
s
- The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
s1
- The local name (without prefix), or the empty string if Namespace processing is not being performed.
s2
- The raw XML 1.0 name (with prefix), or the empty string if raw names are not available.
attributes
- The attributes attached to the element. If there are no attributes, it shall be an empty Attributes object.
org.xml.sax.SAXException
public void characters(char[] chars, int relation, int relation1) throws org.xml.sax.SAXException
AbstractSynchronizedXMLPipe
characters
in interface org.xml.sax.ContentHandler
characters
in class AbstractSynchronizedXMLPipe
chars
- The characters from the XML document.
relation
- The start position in the array.
relation1
- The number of characters to read from the array.
org.xml.sax.SAXException
public void endElement(java.lang.String s, java.lang.String s1, java.lang.String s2) throws org.xml.sax.SAXException
AbstractSynchronizedXMLPipe
endElement
in interface org.xml.sax.ContentHandler
endElement
in class AbstractSynchronizedXMLPipe
s
- The Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed.
s1
- The local name (without prefix), or the empty string if Namespace processing is not being performed.
s2
- The raw XML 1.0 name (with prefix), or the empty string if raw names are not available.
org.xml.sax.SAXException
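The three SAX callbacks above cooperate through the captureElemContent flag and the sBuff buffer. The following is a minimal sketch of that pattern using only JDK classes, not the harvester's actual implementation: the class `HeaderCaptureSketch`, its `parse` helper, and the choice of which elements trigger capture are assumptions made for illustration, while the field names are borrowed from this page.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the harvester's SAX callbacks; not the real class.
class HeaderCaptureSketch extends DefaultHandler {
    static final String OAI_NS = "http://www.openarchives.org/OAI/2.0/";

    // Field names mirror the documented ones; the logic around them is assumed.
    boolean captureElemContent;
    boolean deleteRecord;
    final StringBuffer sBuff = new StringBuffer();
    String currentOaiIdentifier;
    String currentDatestamp;
    String currentOaiStatus;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (!OAI_NS.equals(uri)) return; // ignore elements outside the OAI namespace
        if ("header".equals(localName)) {
            // An OAI-PMH header may carry status="deleted"
            currentOaiStatus = attributes.getValue("", "status");
            deleteRecord = "deleted".equals(currentOaiStatus);
        } else if ("identifier".equals(localName) || "datestamp".equals(localName)) {
            sBuff.setLength(0);          // start with an empty buffer
            captureElemContent = true;   // tell characters() to collect text
        }
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        if (captureElemContent) sBuff.append(chars, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (!captureElemContent || !OAI_NS.equals(uri)) return;
        if ("identifier".equals(localName)) currentOaiIdentifier = sBuff.toString().trim();
        else if ("datestamp".equals(localName)) currentDatestamp = sBuff.toString().trim();
        captureElemContent = false;      // content is consumed in endElement
    }

    // Convenience helper (an assumption): parse a string with a namespace-aware SAX parser.
    static HeaderCaptureSketch parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            HeaderCaptureSketch handler = new HeaderCaptureSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse OAI response", e);
        }
    }
}
```

Because the namespace is checked in every callback, an element such as `dc:identifier` inside the record metadata would not overwrite the OAI identifier in this sketch.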
protected void abortRecordCapture()
protected void handleErrors(java.lang.String errorMsg)
errorMsg
-
protected abstract void prepareRecordCapture() throws org.xml.sax.SAXException
org.xml.sax.SAXException
protected abstract boolean shouldHarvestDocument()
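Since shouldHarvestDocument() is abstract, each subclass supplies its own decision logic. One plausible approach, sketched below under stated assumptions, is to keep the datestamp of the last harvest per OAI identifier and harvest only records that are new or more recent. The class `HarvestDecisionSketch`, its in-memory map, and the method parameters are hypothetical; the real method takes no arguments and would presumably read currentOaiIdentifier and currentDatestamp from the harvester's fields.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for past harvests; the real storage is implementation-specific.
class HarvestDecisionSketch {
    // OAI identifier -> datestamp of the version harvested last time
    private final Map<String, String> lastHarvested = new HashMap<>();

    void recordHarvest(String oaiIdentifier, String datestamp) {
        lastHarvested.put(oaiIdentifier, datestamp);
    }

    boolean shouldHarvestDocument(String currentOaiIdentifier, String currentDatestamp) {
        String previous = lastHarvested.get(currentOaiIdentifier);
        if (previous == null) {
            return true; // record is not yet present in the harvester
        }
        // ISO 8601 UTC datestamps of the same granularity sort lexicographically,
        // so a plain string comparison is enough in this sketch.
        return currentDatestamp.compareTo(previous) > 0;
    }
}
```

The string comparison only holds when both datestamps use the same granularity (day or second); a production implementation would likely parse them into dates first.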
protected abstract void captureRecord() throws java.lang.Exception
java.lang.Exception
protected abstract void prepareRecordForDeletion()
protected abstract void prepareResourceFromUrlIdentifierCapture()
currentMetadtaUrlIdentifier
,
identifierName
protected abstract void captureResourceFromUrlIdentifier()
currentMetadtaUrlIdentifier
,
identifierName
protected abstract boolean storeHarvestedData() throws java.lang.Exception
java.lang.Exception
protected abstract void storeFailedHarvestData(java.lang.Exception e)
protected abstract void handleResumptionToken()
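In OAI-PMH, a follow-up request for the next part of an incomplete list carries only the verb and the resumptionToken, and an empty token marks the final part. A minimal sketch of how a handleResumptionToken() implementation might build the value for newRequestUrl is shown below. The class `ResumptionUrlSketch` and its `nextRequestUrl` method are hypothetical, and passing the verb explicitly is an assumption; the real method takes no arguments.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the real harvester keeps these values in its fields.
class ResumptionUrlSketch {

    // Returns the URL of the follow-up request, or null when there is nothing to resume.
    static String nextRequestUrl(String repoUrl, String verb, String resumptionToken) {
        if (resumptionToken == null || resumptionToken.trim().isEmpty()) {
            return null; // an empty token means the list is complete
        }
        try {
            // The token is opaque and may contain reserved characters, so encode it.
            return repoUrl
                + "?verb=" + URLEncoder.encode(verb, "UTF-8")
                + "&resumptionToken=" + URLEncoder.encode(resumptionToken.trim(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

For example, `nextRequestUrl("http://example.org/oai", "ListRecords", "abc/123")` yields `http://example.org/oai?verb=ListRecords&resumptionToken=abc%2F123`, where the repository URL is a placeholder.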
protected abstract void saveCriticalFields(boolean dataHarvested) throws org.xml.sax.SAXException
dataHarvested
- boolean indicating data was harvested
org.xml.sax.SAXException
protected abstract void resetRecordCaptureFields(boolean deleteDoc)
deleteDoc
-
protected void resetAllFields()
public void recycle()
recycle
in interface org.apache.avalon.excalibur.pool.Recyclable
recycle
in class AbstractSynchronizedXMLProducer
protected void resetResumptionToken()
public void receiveSynchronizedRequest(java.lang.String url)
receiveSynchronizedRequest
in interface OAIHarvester
public void receiveSynchronizedRequest(java.lang.String url, java.lang.String originalRequestUrl)
receiveSynchronizedRequest
in interface OAIHarvester
url
- the URL which represents the request
originalRequestUrl
- the original request
public void receiveRequest(java.lang.String url)
receiveRequest
in interface OAIHarvester
url
-
protected org.apache.avalon.framework.parameters.Parameters getHarvestParameters()
public void close()