Altapay - [INVESTIGATION] Gateway Processing Issue – Incident details

[INVESTIGATION] Gateway Processing Issue

Resolved
Partial outage
Started almost 4 years ago
Lasted about 1 hour

Affected

Processing

Operational from 7:52 AM to 7:52 AM, Partial outage from 7:52 AM to 9:06 AM

Updates
  • Update

    Scope of Incident

    Altapay experienced reduced operability and high latency in payment processing from 8:40 to 9:27.

    Analysis of incident:
    * Load was not distributed evenly across the databases.
    * One database reached 100% load.
    * Although this should have affected only a small number of requests, a limited proxy configuration forced processors to wait for the transactions affected by that database to resolve. New connections beyond the number of processors were dropped, which increased the number of impacted customers.
    * The system was still able to process payments, but some payments received 504 responses and high response times.

    Remedial Action
    * A load study was performed on the spot, and the performance of the affected database was increased.
    * The processor configuration was amended.
    * Monitoring was in place, but it was configured for the database cluster as a whole rather than for each active instance in the cluster. The configuration has now been changed to measure usage per node instead of the accumulated cluster total.
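
The monitoring point above can be illustrated with a minimal sketch. It is not Altapay's actual monitoring configuration: the node names, load figures, and the 90% threshold are all hypothetical, chosen only to show how a check on the cluster average can stay quiet while one node is saturated, whereas a per-node check flags it.

```python
from statistics import mean

# Hypothetical alert threshold (fraction of capacity); not taken from the report.
ALERT_THRESHOLD = 0.90


def cluster_level_alert(node_loads, threshold=ALERT_THRESHOLD):
    """Alert only if the average load across the whole cluster crosses the threshold."""
    return mean(node_loads.values()) >= threshold


def node_level_alerts(node_loads, threshold=ALERT_THRESHOLD):
    """Return the names of the individual nodes whose load crosses the threshold."""
    return sorted(name for name, load in node_loads.items() if load >= threshold)


# Hypothetical snapshot: one database node saturated, the others lightly loaded.
loads = {"db-1": 1.00, "db-2": 0.35, "db-3": 0.30, "db-4": 0.25}

print(cluster_level_alert(loads))  # False: the average load is only 0.475
print(node_level_alerts(loads))    # ['db-1']
```

With unevenly distributed load, the accumulated view averages the saturated node away, which is consistent with the report's decision to move alerting to the node level.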

  • Resolved

    The issue with the affected database has been resolved entirely and we see gateways continue to process as intended. All systems will be monitored closely throughout the day to ensure the continued stability of the payment gateways.

  • Update

    We are continuing to monitor for any further issues.

  • Monitoring

    A fix has been implemented and the gateways are processing again. We will continue monitoring the activity and stability very closely. A new update will follow within the next 30 minutes.

  • Identified

    The issue has been identified as a database problem affecting a small number of gateways, which had the side effect of causing issues all the way to the front end for multiple gateways. We are working on mitigating the issue and currently estimate that it will be resolved within 15-30 minutes. A new update will follow within 30 minutes.

  • Investigating

    A possible processing issue with the Gateway is being investigated. Investigations are ongoing and we will keep you updated. We apologize for any inconvenience this issue is causing.