Poor Man’s Queue
When and How to Use a Database-Based Queue
Broadly speaking, a queue is an asynchronous processing mechanism that allows us to defer tasks that may require heavy processing or depend on remote calls and unstable resources, making them prone to failures and retries. There are many approaches to implementing queues, but they all share the need for a storage engine to keep track of enqueued jobs. Regardless of the storage used, implementations can be designed with varying levels of complexity, flexibility, and scalability. However, we don’t always need a solution that optimizes all these aspects; in many cases, a simple, quick, and disposable alternative is enough to solve an immediate problem.
The Poor Man’s Queue implementation I’ll discuss consists of simply adding a field to database records to indicate whether they have been processed. A scheduled process or daemon performs periodic polling to retrieve, process, and update the enqueued records. Let’s analyze the pros and cons of this approach.
Pros
- Quick and simple implementation: No additional infrastructure is required, only the existing database.
- Ideal for prototypes or temporary solutions: Solves urgent problems without significant investment.
- Easy monitoring and debugging: The status of jobs can be inspected directly in the database.
Cons
- Limited scalability: The database is not optimized for handling high-concurrency queues.
- Polling inefficiency: Unnecessary periodic queries can add extra load.
- Lack of strong guarantees: Without native retry mechanisms or robust persistence, jobs may get stuck.
Implementation
For the basic implementation of a Poor Man's Queue, we will use a table in the database with a field that allows us to identify whether a record is enqueued or has already been processed. We will also implement a process that performs polling and updates the processed records.
1. Table Design
Let’s assume we want to process users pending verification. Our users table could have a specific column to manage the queue:
ALTER TABLE users ADD COLUMN queued_at TIMESTAMP NULL;
queued_at: Used instead of a boolean (queued = true/false) because it allows us to store when a job was enqueued, which aids in metrics and debugging.
- A NULL value indicates that the user is not in the queue.
- A date value indicates that the user is in the queue, and since when.
We can also add a processed_at column to record when the processing was completed:
ALTER TABLE users ADD COLUMN processed_at TIMESTAMP NULL;
2. Enqueue Records
When a user requires processing, we mark them as enqueued:
UPDATE users SET queued_at = NOW() WHERE id = 123;
If there are pre-existing users, we need to determine whether they still require processing, or whether only new users should be enqueued. If already-handled users have a processed_at value set, we can enqueue only the remaining ones in bulk:
UPDATE users SET queued_at = NOW() WHERE queued_at IS NULL AND processed_at IS NULL;
3. Consumption Process (Polling Worker)
The worker that processes the records runs periodically and follows these steps:
- Select enqueued users, avoiding concurrent locks.
- Process them (e.g., validation, document generation, email sending, etc.).
- Update the status in the database.
Example in SQL to select enqueued users, preventing another worker from processing the same ones:
WITH to_process AS (
SELECT id FROM users
WHERE queued_at IS NOT NULL AND processed_at IS NULL
ORDER BY queued_at ASC
LIMIT 10
FOR UPDATE SKIP LOCKED
)
UPDATE users
SET processed_at = NOW(), queued_at = NULL
WHERE id IN (SELECT id FROM to_process);
Here we are:
- Selecting a batch of 10 records with LIMIT 10 to avoid overloading the system.
- Using FOR UPDATE SKIP LOCKED so that concurrent workers skip rows already locked by others, instead of blocking on them or processing them twice.
- Processing and marking the records as completed.
Not all SQL engines support SKIP LOCKED natively, but many have mechanisms for row-level locking that can be used in combination with other techniques to achieve similar outcomes.
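As a sketch of one such workaround for engines without SKIP LOCKED (it assumes a hypothetical claimed_by column and relies on MySQL's support for ORDER BY/LIMIT in single-table UPDATEs), a worker first claims a batch atomically and then reads back only its own rows:

```sql
-- Hypothetical claimed_by column for engines without SKIP LOCKED.
ALTER TABLE users ADD COLUMN claimed_by VARCHAR(50) NULL;

-- Atomically claim a batch for this worker
-- (MySQL allows ORDER BY and LIMIT in a single-table UPDATE).
UPDATE users
SET claimed_by = 'worker-1'
WHERE queued_at IS NOT NULL
  AND processed_at IS NULL
  AND claimed_by IS NULL
ORDER BY queued_at
LIMIT 10;

-- Read back only the rows this worker claimed.
SELECT id FROM users
WHERE claimed_by = 'worker-1' AND processed_at IS NULL;
```

Because the claim is a single atomic UPDATE, two workers can never claim the same row, even without row-level lock skipping.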
4. Implementation in Laravel
If we are using Laravel, we can create an Artisan command to run the processing:
Artisan::command('queue:process-users', function () {
    DB::transaction(function () {
        // Select a batch of enqueued users, locking the rows so
        // another worker cannot take them mid-transaction.
        $users = DB::table('users')
            ->whereNotNull('queued_at')
            ->whereNull('processed_at')
            ->orderBy('queued_at')
            ->limit(10)
            ->lockForUpdate()
            ->get();

        foreach ($users as $user) {
            // Simulating processing
            \Log::info("Processing user: {$user->id}");

            // Mark the user as processed and remove it from the queue.
            DB::table('users')
                ->where('id', $user->id)
                ->update([
                    'processed_at' => now(),
                    'queued_at' => null,
                ]);
        }
    });
})->describe('Processes users in the queue.');
This command can be run periodically with cron:
* * * * * php artisan queue:process-users
Limitations and Issues of a Poor Man’s Queue
While this approach is useful due to its simplicity, it comes with several limitations and problems that can become critical as the system grows. Below, we detail the main drawbacks and strategies to mitigate them.
1. Inefficient Polling
Unlike specialized systems like Redis, RabbitMQ, or SQS, a Poor Man's Queue based on a database lacks an automatic notification mechanism for new enqueued jobs. This forces workers to perform periodic polling, which can lead to:
- Unnecessary queries if no new records are available.
- Additional load on the database, affecting performance for other operations.
Mitigation
- Reduce polling frequency when the workload is low.
- Use events within the application to reduce latency, rather than relying solely on polling.
- Implement a separate jobs table instead of overloading the main data tables.
2. Concurrency and Locking
If multiple processes attempt to process the same records at the same time, it can lead to locking or duplicate processing. This happens because the database is not designed to function as a distributed queue.
Mitigation
- Use FOR UPDATE SKIP LOCKED when selecting records to prevent multiple workers from processing the same job.
- Add a processing_at column to mark records that are currently being worked on, preventing others from picking them up.
- Design the process to be idempotent, meaning that if a record is processed twice, the result will remain the same.
3. Failed and Stuck Jobs
If a process fails before completing a task, the record may remain indefinitely marked as "queued," preventing it from being processed again.
Mitigation
- Add a failed_at column to mark jobs that have failed.
- Set up a retry strategy (this assumes a retries counter that is incremented each time a job fails), for example:
UPDATE users SET queued_at = NOW(), failed_at = NULL
WHERE failed_at IS NOT NULL AND retries < 3;
- Implement an alert policy to notify when failed jobs are accumulating.
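A related safeguard is a periodic "reaper" that re-enqueues jobs stuck in progress for too long. This sketch assumes the processing_at column suggested earlier and uses PostgreSQL interval syntax; the threshold is illustrative:

```sql
-- Requeue jobs that have been "in progress" for too long,
-- e.g. because a worker crashed mid-task (PostgreSQL syntax).
UPDATE users
SET queued_at = NOW(), processing_at = NULL
WHERE processing_at IS NOT NULL
  AND processed_at IS NULL
  AND processing_at < NOW() - INTERVAL '15 minutes';
```

Run it from the same cron that drives the worker; if the processing step can legitimately take longer than the threshold, the job must be idempotent, since it may run twice.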
4. Limited Scalability
As the volume of jobs grows, the database can become a bottleneck. Each enqueue and dequeue operation involves queries and writes to the database, which can degrade its performance.
Mitigation
- Separate the work tables into a distinct database or a dedicated queue table.
- Optimize indexes on the queued_at and processed_at columns to speed up queries.
- Migrate to a dedicated queue system (e.g., Redis, RabbitMQ, SQS) if the workload significantly increases.
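In PostgreSQL, for example, a partial index can cover only the pending rows, so the index stays small no matter how large the table grows (a sketch; syntax varies by engine):

```sql
-- Partial index over pending jobs only (PostgreSQL syntax).
-- The worker's polling query matches this predicate exactly.
CREATE INDEX idx_users_pending
ON users (queued_at)
WHERE queued_at IS NOT NULL AND processed_at IS NULL;
```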
5. Lack of History Persistence
Specialized queue systems often provide metrics on processed jobs, execution times, and error rates. In a Poor Man’s Queue, this information is typically lost once a record is updated.
Mitigation
- Keep a history of processed jobs in a separate table.
- Log detailed information about each execution.
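A minimal sketch of such a history table (the name and columns are illustrative; the worker inserts one row per execution attempt):

```sql
-- Hypothetical history table, populated by the worker on each run.
CREATE TABLE job_history (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,
    started_at TIMESTAMP NOT NULL,
    finished_at TIMESTAMP NULL,
    status VARCHAR(20) NOT NULL,
    error TEXT NULL
);
```

Because rows here are append-only, updating the main record no longer destroys the execution trail, and simple GROUP BY queries give error rates and processing times.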
Evolution Strategies: From a Poor Man’s Queue to a More Scalable System
While a Poor Man’s Queue can be useful in certain scenarios, there comes a point where load, concurrency, or the need for more control make this solution inefficient. To avoid critical issues in the future, it's advisable to design the system with clear evolution paths. Here, we explore several strategies for gradually migrating towards more scalable solutions.
1. Separate Queue Logic into a Specific Table
In the basic implementation, queue markers (queued_at, processed_at, etc.) are often added directly to the business tables (e.g., users, orders). This can cause performance issues and make maintenance harder.
Migration Strategy
Create a dedicated queue table, for example:
CREATE TABLE queue_jobs (
id SERIAL PRIMARY KEY,
entity_id INT NOT NULL,
entity_type VARCHAR(50) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'queued',
attempts INT NOT NULL DEFAULT 0,
queued_at TIMESTAMP NOT NULL DEFAULT NOW(),
processed_at TIMESTAMP NULL,
failed_at TIMESTAMP NULL
);
Modify the system so that, instead of updating the main tables, it inserts records into this new table. This allows for easier scaling and data cleanup without affecting the business tables.
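With this table in place, enqueuing and completing a job no longer touch the business tables (the id values below are illustrative):

```sql
-- Enqueue user 123 without modifying the users table;
-- status, attempts, and queued_at fall back to their defaults.
INSERT INTO queue_jobs (entity_id, entity_type)
VALUES (123, 'user');

-- Mark the job as done once processed.
UPDATE queue_jobs
SET status = 'processed', processed_at = NOW()
WHERE id = 1;
```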
2. Optimize the Polling Process
Inefficient polling can generate unnecessary queries. Before migrating to an event-based system, optimizations can be implemented in the database.
Migration Strategy
- Use LIMIT and ORDER BY queued_at ASC to avoid reading too many records at once.
- Implement FOR UPDATE SKIP LOCKED to safely take jobs without unnecessary locks.
SELECT * FROM queue_jobs
WHERE status = 'queued'
ORDER BY queued_at ASC
LIMIT 10
FOR UPDATE SKIP LOCKED;
- Adjust the polling frequency based on system load.
3. Implement a Notification System Instead of Polling
Continuous polling can be replaced with a notification strategy to reduce the load on the database.
Migration Strategy
- Database Triggers: Use LISTEN/NOTIFY in PostgreSQL to alert workers when new jobs are queued. In MySQL, use events or triggers to send signals to an external system.
- Application Events: Dispatch internal events (e.g., in Laravel, use dispatch() instead of polling). Integrate with WebSockets or simple queues like Redis Pub/Sub.
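As a PostgreSQL sketch of the trigger approach (assuming the queue_jobs table described earlier; CREATE TRIGGER ... EXECUTE FUNCTION requires PostgreSQL 11+), every insert emits a notification that subscribed workers receive without polling:

```sql
-- Notify listeners whenever a job is inserted (PostgreSQL).
CREATE FUNCTION notify_new_job() RETURNS trigger AS $$
BEGIN
    -- Publish the new job id on the 'new_job' channel.
    PERFORM pg_notify('new_job', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER queue_jobs_notify
AFTER INSERT ON queue_jobs
FOR EACH ROW EXECUTE FUNCTION notify_new_job();

-- Each worker subscribes with:
-- LISTEN new_job;
```

Notifications are delivered only to currently connected listeners, so a slow fallback poll is still advisable to catch jobs enqueued while a worker was down.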
4. Migrate to a Distributed Queue System
When the volume of work grows or high availability is required, it’s advisable to migrate to a specialized solution.
Migration Strategy
- Use Redis as an Intermediate Step: Redis provides fast in-memory queues that are easy to integrate. Frameworks like Laravel and Django offer native support for Redis in their queue systems.
- Migrate to a Message Broker (RabbitMQ, Kafka, SQS, etc.): As jobs increase in complexity and need distribution across multiple services, a more robust messaging solution becomes necessary.
Migration Phases
- Have processes write to Redis instead of the database.
- Adapt workers to read from Redis instead of SQL.
- Finally, migrate queue management to an external service (e.g., AWS SQS or Kafka).
5. Make Jobs Idempotent
Regardless of the queue system, jobs should be designed so that if they fail and are retried, the result remains consistent.
Migration Strategy
- Use unique identifiers (job_id) for each job to prevent duplicate executions.
- Implement optimistic locking (e.g., processed_at IS NULL when updating).
- Ensure operations are safe in the case of retries.
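The optimistic-locking idea can be as simple as making the final update conditional on the job still being unprocessed; an affected-row count of zero means another worker got there first (sketched against the queue_jobs table, with an illustrative id):

```sql
-- Only succeeds if the job has not been processed yet.
-- If 0 rows are affected, another worker already handled it
-- and this worker should discard its result.
UPDATE queue_jobs
SET status = 'processed', processed_at = NOW()
WHERE id = 42 AND processed_at IS NULL;
```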
Examples of Real-World Use Cases
🔹 Temporary Process
A team needs to run an ad-hoc command with a high processing load, but no permanent solution is required. For example, processing a million users in a bulk update task. A table with a queued column is created, populated with the users to process, and an iterative process handles them until the task is completed. Once finished, the table and code can be removed without affecting the application's implementation.
🔹 Emergency
In an online store, some orders fail because payment confirmation requires a call to an external system that sometimes doesn’t respond. This causes customers to pay, but the order is not recorded correctly. Implementing a more advanced queue system is not feasible immediately, and the problem must be resolved quickly. Using a Poor Man’s Queue allows deferring confirmation and retrying in case of failure, mitigating the impact on customers.
🔹 Prototype
A new service is in the experimentation phase, and its exact needs are not yet clear. Implementing a robust queue system from the beginning would be costly and premature. Instead, a Poor Man’s Queue provides a quick, functional solution that can evolve into a more scalable design once the requirements become clearer.