feat: add entity analytics, pluggable cache, and pitfalls docs

- Add Cache interface with MemoryCache and NoopCache implementations
- Make SocialhoseClient accept injectable cache option
- Remove Next.js next.revalidate coupling from transport
- Add entity analytics: getEntityBrief, getEntityStats, getEntityBriefs,
  getCampaignIdByMatch with exact sentiment/platform faceting and
  cumulative-differenced timeline
- Add 23 new tests (29 total) covering entity analytics, cache injection,
  sentiment reconciliation, bounded concurrency, and timeline differencing
- Update README with entity analytics, custom caching, and pitfalls sections
- Fix CI branch: main -> master
Mo Elzubeir
2026-05-29 13:10:05 -05:00
parent e34552ac33
commit 252ea713b1
7 changed files with 836 additions and 30 deletions
+80 -5
@@ -44,8 +44,11 @@ The SDK sends `Authorization: Api-Key <key>` and a browser-like `User-Agent` by
## Endpoints
### Campaigns
- `getCampaigns()`
- `getCampaign(id)`
### Analytics
- `getOverview(filters)`
- `getTimeline(filters)`
- `getSentiment(filters)`
@@ -54,11 +57,23 @@ The SDK sends `Authorization: Api-Key <key>` and a browser-like `User-Agent` by
- `getTopKeywords(filters)`
- `getTrending(filters)`
- `getTopMentions(filters)`
### Mentions
- `getMentions(filters)`
### Mailing Lists
- `getMailingLists()`
- `inviteMailingListMember(listId, invite)`
- `get(path, params)` for lower-level GET access
- `post(path, body)` for lower-level POST access
### Entity Analytics
- `getEntityBrief(term, campaignId?)` — one request: count + top-20 engagement sample with derived sentiment/platform
- `getEntityStats(term, campaignId?)` — full dashboard: exact sentiment faceting, exact platform mix, 14-day cumulative-differenced timeline, 7d momentum
- `getEntityBriefs(terms, campaignId?, concurrency?)` — batch entity resolution with bounded concurrency (default 20)
- `getCampaignIdByMatch(substring)` — resolve a live campaign ID by matching its name
### Low-Level
- `get(path, params)` for direct GET access
- `post(path, body)` for direct POST access
## Filtering examples
@@ -77,14 +92,74 @@ await socialhose.getTimeline({
});
```
## Next.js cache integration
## Custom Caching
Pass `revalidateSeconds` per request. In Next.js this is forwarded as `fetch(..., { next: { revalidate } })`; outside Next.js it is harmless.
The SDK ships with `MemoryCache` (in-memory, per-entry TTL) and `NoopCache` (no caching). You can inject your own by implementing the `Cache` interface:
```ts
await socialhose.getMentions({ content_search: 'ozempic' }, { revalidateSeconds: 3600 });
import { SocialhoseClient, Cache } from '@socialhose/api';
class RedisCache implements Cache {
async get(key: string) { /* redis.get(key) */ }
async set(key: string, value: unknown, ttlMs: number) { /* redis.set(key, value, 'PX', ttlMs) */ }
async delete(key: string) { /* redis.del(key) */ }
}
const socialhose = new SocialhoseClient({
apiKey: process.env.SOCIALHOSE_API_KEY!,
cache: new RedisCache(),
});
```
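As a rough illustration of what a custom implementation involves, here is a minimal Map-backed cache with per-entry TTL. The `CacheLike` interface is an assumption inferred from the `RedisCache` example above (the SDK's real `Cache` interface may differ, for instance in its return types), and `TtlMapCache` is a hypothetical name, not the shipped `MemoryCache`.

```typescript
// Assumed shape of the SDK's `Cache` interface, inferred from the README
// example; the real interface may differ.
interface CacheLike {
  get(key: string): Promise<unknown>;
  set(key: string, value: unknown, ttlMs: number): Promise<void>;
  delete(key: string): Promise<void>;
}

// Hypothetical in-memory cache. The clock is injectable so expiry can be
// exercised in tests without waiting for real time to pass.
class TtlMapCache implements CacheLike {
  private entries = new Map<string, { value: unknown; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async get(key: string): Promise<unknown> {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  async set(key: string, value: unknown, ttlMs: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key);
  }
}
```

Expired entries are evicted lazily on read, which keeps the sketch short. A long-lived process would likely also want a size bound or periodic sweep.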
Pass `revalidateSeconds` per request to control per-call TTL in your cache implementation:
```ts
await socialhose.getMentions(
{ content_search: 'ozempic' },
{ revalidateSeconds: 3600 },
);
```
## Entity Analytics
Search across all mentions for a specific term, person, or organization:
```ts
// Quick count + top mentions
const brief = await socialhose.getEntityBrief('Burhan');
console.log(brief.total, brief.sentiment, brief.platformMix);
// Full dashboard: exact distributions, timeline, momentum
const stats = await socialhose.getEntityStats('RSF', 'campaign-id');
console.log(
stats.total,
stats.momentumPct, // last 7 days vs prior 7
stats.sentiment, // exact (facets reconcile) or estimated (from sample)
stats.sparkline, // 14-day daily volume
);
// Batch resolve many entities with bounded concurrency
const briefs = await socialhose.getEntityBriefs(
['Burhan', 'Hemedti', 'SAF', 'RSF'],
'campaign-id',
10, // concurrency
);
```
Entity analytics fan out multiple requests per entity (sentiment faceting: 3 calls; platform mix: 6 calls; timeline: 15 calls). Set `cacheTtlMs` or inject a persistent cache to stay under the ~60 req/min rate limit.
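To make the budget concrete, the arithmetic can be sketched as follows. This counts only the three fan-outs listed above (any additional calls, such as the initial brief request, are not included), and the helper names are hypothetical, not part of the SDK.

```typescript
// Per-entity fan-out, taken from the figures quoted in this README.
const CALLS_PER_ENTITY = { sentiment: 3, platform: 6, timeline: 15 };

// Approximate per-key limit quoted in this README.
const RATE_LIMIT_PER_MIN = 60;

// Requests needed to build stats for `entityCount` entities with a cold cache.
function coldRequests(entityCount: number): number {
  const perEntity =
    CALLS_PER_ENTITY.sentiment +
    CALLS_PER_ENTITY.platform +
    CALLS_PER_ENTITY.timeline;
  return entityCount * perEntity;
}

// Minimum whole minutes needed to stay within the limit, assuming requests
// are spread evenly and nothing is served from cache.
function minutesToWarm(entityCount: number): number {
  return Math.ceil(coldRequests(entityCount) / RATE_LIMIT_PER_MIN);
}
```

Under these assumptions a single entity costs 24 requests, so a cold dashboard of four entities (96 requests) cannot load within one minute's budget.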
## Pitfalls
**Cloudflare UA blocking.** The Socialhose API sits behind Cloudflare, which rejects some non-browser User-Agent headers. The SDK defaults to a Chrome 124 UA — don't change it unless you've verified the new UA works.
**Entity timeline uses cumulative differencing.** The analytics timeline endpoint is campaign-scoped and ignores `content_search`. The SDK instead facets `/mentions/` by day: it issues cumulative `date_from`-only queries and subtracts consecutive counts to get each day's volume. This avoids the API's `date_to` inclusivity bug, where adjacent `[date_from, date_to]` windows share a boundary day and double-count it. Don't "simplify" this to per-day windows.
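The subtraction step can be sketched like this. It is an illustration of the technique rather than the SDK's actual code: `differenceCumulative` is a hypothetical helper, and it assumes the input holds one count per boundary day (oldest first), where each count is the number of mentions on or after that day.

```typescript
// cumulative[i] = mentions on or after boundary day i (a `date_from`-only
// query), ordered oldest first. N + 1 boundaries yield N daily counts.
function differenceCumulative(cumulative: readonly number[]): number[] {
  const daily: number[] = [];
  let prev: number | undefined;
  for (const count of cumulative) {
    if (prev !== undefined) {
      // Mentions on the previous boundary day =
      //   (on or after that day) - (on or after the next day).
      // Clamp at 0 in case totals shifted between requests (an assumption
      // of this sketch, not documented SDK behavior).
      daily.push(Math.max(0, prev - count));
    }
    prev = count;
  }
  return daily;
}

// Four boundary queries produce three daily counts.
const example = differenceCumulative([10, 7, 7, 2]); // [3, 0, 5]
```

Each day is counted exactly once because every count enters the subtraction as a lower bound only; no query carries an inclusive upper bound.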
**Sentiment reconciliation checks.** For its exact sentiment and platform distributions, `getEntityStats` verifies that the facet counts sum to the known total (e.g., positive + negative + neutral === total). A mismatch means the API silently dropped the `content_search` filter, so the SDK falls back to estimates from the brief's sample rather than showing wrong data.
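A minimal sketch of that check, assuming facet counts and the sample-based estimate are both available as plain label-to-count maps. The `reconcile` function and its types are hypothetical names for illustration, not the SDK's API.

```typescript
type Distribution = Record<string, number>;

interface ReconciledDistribution {
  counts: Distribution;
  exact: boolean; // true when facet counts summed to the known total
}

// Trust the faceted counts only when they add up to the known total;
// otherwise fall back to the estimate derived from the sample.
function reconcile(
  facets: Distribution,
  total: number,
  sampleEstimate: Distribution,
): ReconciledDistribution {
  const sum = Object.values(facets).reduce((acc, n) => acc + n, 0);
  return sum === total
    ? { counts: facets, exact: true }
    : { counts: sampleEstimate, exact: false };
}
```

Surfacing an `exact` flag alongside the counts lets callers label estimated figures in a UI instead of presenting them as precise.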
**Rate limit ~60 req/min per API key.** Entity analytics fan out many parallel requests. Use the `cache` option with a persistent store (Redis, Next.js Data Cache) to keep warm loads under the limit.
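If you fan out your own calls on top of the SDK, capping how many run at once helps avoid bursts. The following is a generic bounded-concurrency mapper, offered as a sketch: `mapWithConcurrency` is a hypothetical helper, not something the SDK exports, and it limits in-flight requests rather than enforcing a per-minute rate.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner repeatedly claims the next index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

A call such as `mapWithConcurrency(terms, 5, (term) => socialhose.getEntityStats(term))` would then keep at most five entities in flight. If any worker rejects, the returned promise rejects with that error.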
**Never expose the API key to the browser.** This SDK is designed for server-side use. Always set `SOCIALHOSE_API_KEY` as a server-side environment variable.
## Errors
Failed requests throw `SocialhoseError` with `status`, `path`, and response `body` when available.