Merged
169 changes: 169 additions & 0 deletions CONFIGURATION.md
@@ -0,0 +1,169 @@
# Configuration Guide

This document describes the available configuration options for the Coveo Push API Java Client.

## Batch Size Configuration

The batch size controls how much data the client accumulates before it creates a file container and pushes it to Coveo. The default is **5 MB**. The maximum allowed is **256 MB** (Stream API limit).
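To make the accumulate-then-push behavior concrete, here is a minimal sketch of a size-triggered queue. `BatchAccumulatorSketch` and its methods are hypothetical names used for illustration only; this is not the SDK's `DocumentUploadQueue`, and whether the real queue flushes just before or just after the limit is reached is an assumption here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the SDK's DocumentUploadQueue.
class BatchAccumulatorSketch {
    private final int maxBatchBytes;
    private final List<String> pending = new ArrayList<>();
    private int pendingBytes = 0;
    private int flushCount = 0;

    BatchAccumulatorSketch(int maxBatchBytes) {
        this.maxBatchBytes = maxBatchBytes;
    }

    // Assumption: flush first when adding the document would exceed the limit.
    void add(String documentJson) {
        int docBytes = documentJson.getBytes(StandardCharsets.UTF_8).length;
        if (!pending.isEmpty() && pendingBytes + docBytes > maxBatchBytes) {
            flush();
        }
        pending.add(documentJson);
        pendingBytes += docBytes;
    }

    // Stands in for "create a file container and push"; here it only counts.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushCount++;
        pending.clear();
        pendingBytes = 0;
    }

    int getFlushCount() {
        return flushCount;
    }

    int getPendingBytes() {
        return pendingBytes;
    }

    public static void main(String[] args) {
        // A tiny 10-byte limit keeps the example readable.
        BatchAccumulatorSketch queue = new BatchAccumulatorSketch(10);
        queue.add("aaaa");
        queue.add("bbbb");
        queue.add("cccc"); // 12 bytes would exceed 10, so the first two are flushed
        queue.flush();     // push whatever is left
        System.out.println("flushes: " + queue.getFlushCount());
    }
}
```

A larger limit means fewer flushes for the same input, at the cost of holding more data in memory between pushes.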

### Configuration Methods

There are two ways to configure the batch size:

#### 1. System Property (Runtime Configuration)

Set the `coveo.push.batchSize` system property to configure the default batch size globally for all service instances:

**Java Command Line:**

```bash
java -Dcoveo.push.batchSize=134217728 -jar your-application.jar
```

**Within Java Code:**

```java
// Set before creating any service instances
System.setProperty("coveo.push.batchSize", "134217728"); // 128 MB in bytes
```

**Maven/Gradle Build:**

```xml
<!-- pom.xml -->
<properties>
    <argLine>-Dcoveo.push.batchSize=134217728</argLine>
</properties>
```

```groovy
// build.gradle
test {
    systemProperty 'coveo.push.batchSize', '134217728'
}
```

**Example Values:**

- `5242880` = 5 MB (default)
- `268435456` = 256 MB (maximum)
- `134217728` = 128 MB
- `67108864` = 64 MB
- `33554432` = 32 MB
- `10485760` = 10 MB
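The byte values above are plain megabyte multiples (1 MB = 1024 × 1024 bytes). A small helper can compute them instead of hard-coding the numbers; `BatchSizeBytes` and `fromMegabytes` are hypothetical names, not part of the SDK.

```java
// Hypothetical helper for illustration; not part of the SDK.
class BatchSizeBytes {
    static final int BYTES_PER_MB = 1024 * 1024;

    // Math.multiplyExact throws ArithmeticException instead of silently overflowing.
    static int fromMegabytes(int megabytes) {
        return Math.multiplyExact(megabytes, BYTES_PER_MB);
    }

    public static void main(String[] args) {
        for (int mb : new int[] {5, 10, 32, 64, 128, 256}) {
            System.out.println(mb + " MB = " + fromMegabytes(mb) + " bytes");
        }
    }
}
```

The result can be passed to `System.setProperty` via `String.valueOf(...)` or used directly as a constructor argument.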

#### 2. Constructor Parameter (Per-Instance Configuration)

Pass the `maxQueueSize` parameter when creating service instances:

```java
// UpdateStreamService with custom 128 MB batch size
UpdateStreamService service = new UpdateStreamService(
    catalogSource,
    backoffOptions,
    null, // userAgents (optional)
    128 * 1024 * 1024 // 128 MB in bytes
);

// PushService with custom batch size
PushService pushService = new PushService(
    pushEnabledSource,
    backoffOptions,
    128 * 1024 * 1024 // 128 MB
);

// StreamService with custom batch size
StreamService streamService = new StreamService(
    streamEnabledSource,
    backoffOptions,
    null, // userAgents (optional)
    128 * 1024 * 1024 // 128 MB
);
```

### Configuration Priority

When both methods are used:

1. **Constructor parameter** takes precedence (if specified)
2. **System property** is used as default (if set)
3. **Built-in default** of 5 MB is used otherwise
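The precedence order above can be sketched as a single lookup. This is an illustration under stated assumptions, not the SDK's implementation: `BatchSizeResolverSketch` and `resolve` are hypothetical names, and a `null` argument stands in for "no size was passed to the constructor".

```java
// Hypothetical sketch of the precedence rules; not the SDK's actual code.
class BatchSizeResolverSketch {
    static final String PROPERTY = "coveo.push.batchSize";
    static final int DEFAULT_BYTES = 5 * 1024 * 1024; // 5 MB built-in default

    // constructorValue == null models "no batch size passed to the constructor".
    static int resolve(Integer constructorValue) {
        if (constructorValue != null) {
            return constructorValue; // 1. constructor parameter wins
        }
        String fromProperty = System.getProperty(PROPERTY);
        if (fromProperty != null) {
            // 2. system property; parseInt throws NumberFormatException
            // (a subclass of IllegalArgumentException) on non-numeric input.
            return Integer.parseInt(fromProperty.trim());
        }
        return DEFAULT_BYTES; // 3. built-in default
    }

    public static void main(String[] args) {
        System.out.println("no property, no argument: " + resolve(null));
        System.setProperty(PROPERTY, "134217728");
        System.out.println("property only: " + resolve(null));
        System.out.println("property and argument: " + resolve(64 * 1024 * 1024));
    }
}
```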

### Validation Rules

All batch size values are validated:

- ✅ **Maximum:** 256 MB (268,435,456 bytes) - API limit
- ✅ **Minimum:** Greater than 0
- ❌ Values exceeding 256 MB will throw `IllegalArgumentException`
- ❌ Invalid, zero, or negative values will throw `IllegalArgumentException`
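As a rough sketch, the rules above amount to a range check of the following shape. `BatchSizeValidatorSketch` and `validate` are hypothetical names, and the exception messages are made up for illustration; only the limits and the exception type come from this guide.

```java
// Hypothetical sketch of the validation rules; not the SDK's actual code.
class BatchSizeValidatorSketch {
    static final int MAX_BYTES = 256 * 1024 * 1024; // 268,435,456 bytes

    // Returns the value unchanged when it is within (0, MAX_BYTES].
    static int validate(int batchSizeBytes) {
        if (batchSizeBytes <= 0) {
            throw new IllegalArgumentException(
                "Batch size must be greater than 0, got " + batchSizeBytes);
        }
        if (batchSizeBytes > MAX_BYTES) {
            throw new IllegalArgumentException(
                "Batch size must not exceed " + MAX_BYTES + " bytes, got " + batchSizeBytes);
        }
        return batchSizeBytes;
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + validate(128 * 1024 * 1024));
        try {
            validate(MAX_BYTES + 1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```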

### Examples

#### Example 1: Using System Property

```java
// Configure globally via system property
System.setProperty("coveo.push.batchSize", "134217728"); // 128 MB

// All services will use 128 MB by default
UpdateStreamService updateService = new UpdateStreamService(catalogSource, backoffOptions);
PushService pushService = new PushService(pushEnabledSource, backoffOptions);
StreamService streamService = new StreamService(streamEnabledSource, backoffOptions);
```

#### Example 2: Override Per Service

```java
// Set global default to 128 MB
System.setProperty("coveo.push.batchSize", "134217728");

// Update service uses global default (128 MB)
UpdateStreamService updateService = new UpdateStreamService(catalogSource, backoffOptions);

// Push service overrides with 64 MB
PushService pushService = new PushService(pushEnabledSource, backoffOptions, 64 * 1024 * 1024);

// Stream service uses global default (128 MB)
StreamService streamService = new StreamService(streamEnabledSource, backoffOptions);
```

### When to Adjust Batch Size

**Use smaller batches (32-64 MB) when:**

- Network bandwidth is limited
- Memory is constrained
- Processing many small documents
- You want more frequent progress updates

**Use larger batches (128-256 MB) when:**

- Network bandwidth is high
- Processing large documents or files
- You want to minimize API calls
- Maximum throughput is needed

**Keep default (5 MB) when:**

- You're unsure
- Memory is a concern
- You want predictable, frequent pushes
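One way to weigh these trade-offs is to estimate how many pushes a given upload would need at each batch size. The sketch below is simple ceiling division and assumes every batch fills exactly to the limit, which real documents will not do, so treat the result as a lower bound. `PushCountEstimate` and `estimatePushes` are hypothetical names.

```java
// Hypothetical back-of-the-envelope estimate; not part of the SDK.
class PushCountEstimate {

    // Ceiling division: the minimum number of batches needed for totalBytes.
    static long estimatePushes(long totalBytes, int batchSizeBytes) {
        if (totalBytes < 0 || batchSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and batch size positive");
        }
        return (totalBytes + batchSizeBytes - 1) / batchSizeBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        for (int mb : new int[] {5, 64, 128, 256}) {
            int batchBytes = mb * 1024 * 1024;
            System.out.println(mb + " MB batches: at least "
                + estimatePushes(oneGiB, batchBytes) + " pushes for 1 GiB");
        }
    }
}
```

For 1 GiB of data this gives at least 205 pushes at the 5 MB default versus 4 at the 256 MB maximum.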

### Configuration Property Reference

| Property Name | Description | Default Value | Valid Range |
| ---------------------- | --------------------------- | ---------------- | -------------- |
| `coveo.push.batchSize` | Default batch size in bytes | `5242880` (5 MB) | 1 to 268435456 |

## Additional Configuration

### Environment Variables

The following environment variables can be used for general configuration:

- `COVEO_API_KEY` - API key for authentication
- `COVEO_ORGANIZATION_ID` - Organization identifier
- `COVEO_PLATFORM_URL` - Custom platform URL (if needed)

Refer to the Coveo Platform documentation for complete environment configuration options.
8 changes: 8 additions & 0 deletions README.md
@@ -200,6 +200,14 @@ public class PushOneDocument {
}
```

## Configuration

### Batch Size Configuration

By default, the SDK accumulates up to **5 MB** of documents before automatically creating a file container and pushing it. The maximum allowed batch size is **256 MB** (matching the Coveo Stream API limit). You can configure this globally via a system property or per service via a constructor parameter.

For complete configuration details, examples, and best practices, see **[CONFIGURATION.md](CONFIGURATION.md)**.

### Exponential Backoff Retry Configuration

By default, the SDK uses an exponential backoff retry mechanism. Exponential backoff lets the SDK retry throttled requests several times, waiting longer before each subsequent attempt. Outgoing requests are retried when the platform returns a `429` status code.
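As a generic illustration of the idea (not the SDK's implementation), the wait time can grow by a fixed multiplier on each attempt. `BackoffDelaySketch`, its methods, and the 1000 ms / ×2 values below are hypothetical and are not the SDK's defaults.

```java
// Hypothetical sketch of exponential backoff; values are not the SDK's defaults.
class BackoffDelaySketch {

    // attempt 0 waits initialDelayMillis; each later attempt multiplies the wait.
    static long delayMillis(long initialDelayMillis, int multiplier, int attempt) {
        long delay = initialDelayMillis;
        for (int i = 0; i < attempt; i++) {
            delay = Math.multiplyExact(delay, (long) multiplier);
        }
        return delay;
    }

    // Only throttled requests (HTTP 429) are retried.
    static boolean shouldRetry(int statusCode) {
        return statusCode == 429;
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 4; attempt++) {
            System.out.println("attempt " + attempt + ": wait "
                + delayMillis(1000, 2, attempt) + " ms");
        }
    }
}
```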
Binary file added samples/ConfigureBatchSize.class
Binary file not shown.
49 changes: 49 additions & 0 deletions samples/ConfigureBatchSize.java
@@ -0,0 +1,49 @@
import com.coveo.pushapiclient.*;
import com.coveo.pushapiclient.exceptions.NoOpenFileContainerException;

import java.io.IOException;

/**
 * Demonstrates how to configure the batch size for document uploads.
 *
 * The batch size controls how much data accumulates before automatically
 * creating a file container and pushing to Coveo. Default is 5 MB, max is 256 MB.
 */
public class ConfigureBatchSize {

    public static void main(String[] args) throws IOException, InterruptedException, NoOpenFileContainerException {

        PlatformUrl platformUrl = new PlatformUrlBuilder()
                .withEnvironment(Environment.PRODUCTION)
                .withRegion(Region.US)
                .build();

        CatalogSource catalogSource = CatalogSource.fromPlatformUrl(
                "my_api_key", "my_org_id", "my_source_id", platformUrl);

        // Option 1: Use default batch size (5 MB)
        // This creates an UpdateStreamService with the built-in 5 MB limit
        UpdateStreamService defaultService = new UpdateStreamService(catalogSource);

        // Option 2: Configure batch size via constructor (50 MB)
        // Pass the custom batch size directly as an integer parameter
        int fiftyMegabytes = 50 * 1024 * 1024;
        UpdateStreamService customService = new UpdateStreamService(
                catalogSource,
                new BackoffOptionsBuilder().build(),
                null,
                fiftyMegabytes);

        // Option 3: Configure globally via system property (affects all services)
        // Run with: java -Dcoveo.push.batchSize=52428800 ConfigureBatchSize
        // This sets 50 MB for all service instances that don't specify a size
        // This approach allows configuration at runtime without code changes

        // Use the service
        DocumentBuilder document = new DocumentBuilder("https://my.document.uri", "My document title")
                .withData("these words will be searchable");

        customService.addOrUpdate(document);
        customService.close();
    }
}
@@ -0,0 +1,36 @@
package com.coveo.pushapiclient;

import com.google.gson.Gson;
import java.io.IOException;
import java.net.http.HttpResponse;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

class CatalogStreamUploadHandler implements StreamUploadHandler {
  private static final Logger logger = LogManager.getLogger(CatalogStreamUploadHandler.class);
  private final StreamEnabledSource source;
  private final PlatformClient platformClient;

  CatalogStreamUploadHandler(StreamEnabledSource source, PlatformClient platformClient) {
    this.source = source;
    this.platformClient = platformClient;
  }

  @Override
  public HttpResponse<String> uploadAndPush(StreamUpdate stream)
      throws IOException, InterruptedException {
    // Step 1: Create file container
    logger.debug("Creating file container for stream upload");
    HttpResponse<String> containerResponse = platformClient.createFileContainer();
    FileContainer container = new Gson().fromJson(containerResponse.body(), FileContainer.class);

    // Step 2: Upload content to container
    String batchUpdateJson = new Gson().toJson(stream.marshal());
    logger.debug("Uploading stream content to file container: {}", container.fileId);
    platformClient.uploadContentToFileContainer(container, batchUpdateJson);

    // Step 3: Push container to stream source
    logger.info("Pushing file container to stream source: {}", source.getId());
    return platformClient.pushFileContainerContentToStreamSource(source.getId(), container);
  }
}
23 changes: 22 additions & 1 deletion src/main/java/com/coveo/pushapiclient/PushService.java
@@ -14,11 +14,32 @@ public PushService(PushEnabledSource source) {
}

public PushService(PushEnabledSource source, BackoffOptions options) {
this(source, options, DocumentUploadQueue.getConfiguredBatchSize());
}

/**
* Creates a new PushService with configurable batch size.
*
* <p>Example batch sizes in bytes:
*
* <ul>
* <li>5 MB (default): {@code 5 * 1024 * 1024} = {@code 5242880}
* <li>50 MB: {@code 50 * 1024 * 1024} = {@code 52428800}
* <li>256 MB (max): {@code 256 * 1024 * 1024} = {@code 268435456}
* </ul>
*
* @param source The source to push documents to.
* @param options The configuration options for exponential backoff.
* @param maxQueueSize The maximum batch size in bytes before auto-flushing (default: 5MB, max:
* 256MB).
* @throws IllegalArgumentException if maxQueueSize exceeds 256MB or is not positive.
*/
public PushService(PushEnabledSource source, BackoffOptions options, int maxQueueSize) {
String apiKey = source.getApiKey();
String organizationId = source.getOrganizationId();
PlatformUrl platformUrl = source.getPlatformUrl();
UploadStrategy uploader = this.getUploadStrategy();
- DocumentUploadQueue queue = new DocumentUploadQueue(uploader);
+ DocumentUploadQueue queue = new DocumentUploadQueue(uploader, maxQueueSize);

this.platformClient = new PlatformClient(apiKey, organizationId, platformUrl, options);
this.service = new PushServiceInternal(queue);
@@ -1,17 +1,21 @@
package com.coveo.pushapiclient;

import java.io.IOException;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class StreamDocumentUploadQueue extends DocumentUploadQueue {

private static final Logger logger = LogManager.getLogger(StreamDocumentUploadQueue.class);
private StreamUploadHandler streamHandler;
protected ArrayList<PartialUpdateDocument> documentToPartiallyUpdateList;
private HttpResponse<String> lastResponse;

- public StreamDocumentUploadQueue(UploadStrategy uploader) {
- super(uploader);
+ public StreamDocumentUploadQueue(StreamUploadHandler handler, int maxQueueSize) {
+ super(null, maxQueueSize);
+ this.streamHandler = handler;
this.documentToPartiallyUpdateList = new ArrayList<>();
}

@@ -25,13 +29,19 @@ public StreamDocumentUploadQueue(UploadStrategy uploader) {
public void flush() throws IOException, InterruptedException {
if (this.isEmpty()) {
logger.debug("Empty batch. Skipping upload");
this.lastResponse = null;
return;
}
// TODO: LENS-871: support concurrent requests
StreamUpdate stream = this.getStream();
logger.info("Uploading document Stream");
- this.uploader.apply(stream);

+ this.lastResponse = this.streamHandler.uploadAndPush(stream);

clearQueue();
}

private void clearQueue() {
this.size = 0;
this.documentToAddList.clear();
this.documentToDeleteList.clear();
@@ -78,4 +88,13 @@ public BatchUpdate getBatch() {
public boolean isEmpty() {
return super.isEmpty() && documentToPartiallyUpdateList.isEmpty();
}

/**
* Returns the HTTP response from the last flush operation.
*
* @return The last response, or null if no flush has occurred or queue was empty.
*/
HttpResponse<String> getLastResponse() {
return this.lastResponse;
}
}