# Backend Critical Issues & TODOs

This document outlines critical issues found in the backend codebase that require immediate attention. The issues range from severe data-integrity risks to security vulnerabilities and poor API design.

---

### 1. Catastrophic Data Loss via Cascading Deletes

**Severity: CRITICAL**

**Problem:**
The SQLAlchemy models in `be/app/models.py` are configured with `ondelete="CASCADE"` and `cascade="all, delete-orphan"` on relationships to critical shared data such as `Expense`, `List`, `Settlement`, and `Chore`. This creates a chain reaction of data destruction that can be triggered by a single user's action, affecting the data of all other users in a group.

**Specific Scenarios:**

- **Deleting a User:** When a user is deleted, all expenses they paid for, created, or were a part of (via `ExpenseSplit`) are deleted. All settlements they participated in are also deleted. This permanently corrupts the financial history of every group the user was a member of, violating data integrity for all other members.
- **Deleting a Group:** Deleting a group permanently deletes **all** associated data, including every expense, shopping list, item, settlement, and the entire chore history.
- **Deleting a List:** Deleting a shopping list deletes all associated expenses.

**Recommendation:**

- Immediately remove all `ondelete="CASCADE"` and `cascade="all, delete-orphan"` configurations from the models where they affect shared data.
- Implement a **soft-delete** or **anonymization** strategy for user deletion to comply with data protection regulations (like GDPR's "right to be forgotten") without destroying the integrity of other users' data. When a user is "deleted," their PII should be scrubbed, but their ID and associated records should remain so the financial history stays intact.

---

### 2. Hidden Group Deletion in "Leave Group" Endpoint

**Severity: CRITICAL**

**Problem:**
There is no explicit `DELETE /groups/{group_id}` endpoint. Instead, the logic for deleting a group is hidden within the `DELETE /groups/{group_id}/leave` endpoint in `be/app/api/v1/endpoints/groups.py`. If the last remaining member of a group, who is also the owner, calls this endpoint, the entire group and all its associated data are silently and permanently deleted.

**Why this is a problem:**

- **Non-Obvious Destruction:** A user would never expect a "leave" action to be destructive for the entire group.
- **No Confirmation:** There is no "are you sure?" check. The action is immediate and irreversible.
- **High Risk of Accidental Deletion:** An owner could easily remove all members and then leave, accidentally wiping all data.

**Recommendation:**

- Create a separate, explicit `DELETE /groups/{group_id}` endpoint.
- This new endpoint must be protected and should only be accessible to the group owner.
- Implement multiple confirmation steps before allowing the deletion (e.g., requiring the user to type the group's name to confirm).
- The "leave group" endpoint should **never** result in the deletion of the group. If the last member leaves, the group should be archived or left empty, but not destroyed.

---

### 3. User Deletion Triggers Data Cascade

**Severity: CRITICAL**

**Problem:**
The user management routes in `be/app/api/v1/endpoints/users.py` use the `fastapi-users` library. The configuration in `be/app/auth.py` uses the default `BaseUserManager`, which performs a **hard delete** on the user record in the database.

**Why this is a problem:**
This is the trigger for the catastrophic data loss described in issue #1. A simple call to the `DELETE /users/me` endpoint will initiate the cascade, destroying shared financial records across multiple groups. This is a direct violation of data integrity for other users.

**Recommendation:**

- Override the default `delete` method in the `UserManager` class in `be/app/auth.py`.
- The new method must implement the anonymization strategy discussed in issue #1. It should scrub the user's PII and mark the account as inactive, but **not** delete the row from the database (a sketch follows below).
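For illustration, a minimal sketch of such an override. It assumes the default `fastapi-users` SQLAlchemy columns (`email`, `hashed_password`, `is_active`) and an integer primary key; the `app.models` import path and the anonymized email format are placeholders, not the project's actual names:

```python
import uuid
from typing import Optional

from fastapi import Request
from fastapi_users import BaseUserManager, IntegerIDMixin

from app.models import User  # assumed import path


class UserManager(IntegerIDMixin, BaseUserManager[User, int]):
    async def delete(self, user: User, request: Optional[Request] = None) -> None:
        """Anonymize instead of hard-deleting, so rows that reference the
        user's ID (expenses, splits, settlements) stay intact."""
        await self.user_db.update(
            user,
            {
                # Scrub PII but keep the primary key and the row itself.
                "email": f"deleted-user-{uuid.uuid4()}@example.invalid",
                "hashed_password": "",  # no credential can ever match again
                "is_active": False,     # fastapi-users rejects inactive logins
            },
        )
```

Any additional PII columns on the custom `User` model (display name, avatar, etc.) would need the same scrubbing, and frontend code should be prepared to render anonymized users gracefully.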
---

### 4. Inconsistent and Flawed Authorization in Financial Endpoints

**Severity: HIGH**

**Problem:**
Authorization logic is inconsistent across the financial endpoints in `be/app/api/v1/endpoints/financials.py`, creating security gaps and potential for misuse.

**Specific Scenarios:**

- **Creating vs. Updating Expenses:** A regular user cannot create an expense paid by someone else, but they _can_ modify an existing expense as long as they were the original payer. This is inconsistent. A better approach would be to only allow group **owners** to modify financial records created by others.
- **Deleting Settlements:** The `delete_settlement_record` endpoint dangerously allows the user who _created_ the settlement record to delete it. A user could pay a debt, have the recipient confirm it, and then simply delete the record of the payment. Only group **owners** should have the power to delete financial records, and it should be a heavily audited action.

**Recommendation:**

- Establish a clear and consistent authorization policy. A good default policy would be:
  1. Users can create and modify their own financial entries (e.g., expenses they paid for).
  2. Only group **owners** can modify or delete financial entries created by other users.
- Refactor all financial endpoints (`expenses`, `settlements`) to strictly and consistently enforce this policy.

---

### 5. No Protection Against Over-Settlement of Debts

**Severity: HIGH**

**Problem:**
The `record_settlement_for_expense_split` endpoint in `financials.py` does not validate whether the settlement amount exceeds the amount owed for that expense split.

**Why this is a problem:**
A malicious user could exploit this by "settling" a $5 debt with a $500 payment. The system would record this, creating a false record that the original payer now owes the malicious user $495. This can be used to create phantom debts.

**Recommendation:**

- In `crud_settlement_activity.create_settlement_activity`, add validation to ensure the `amount_paid` in a settlement activity does not exceed the remaining `owed_amount` on the `ExpenseSplit`, as sketched below.
- If an overpayment is attempted, the API should return a `400 Bad Request` error with a clear message.
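A sketch of that guard, assuming `ExpenseSplit.owed_amount` and a `SettlementActivity` model with `expense_split_id`, `payer_id`, and `amount_paid` columns (names are assumptions based on this document). It raises `HTTPException` directly for brevity; a domain-level exception translated at the API layer would work equally well:

```python
from decimal import Decimal

from fastapi import HTTPException, status
from sqlalchemy import func, select
from sqlalchemy.ext.asyncio import AsyncSession

from app.models import ExpenseSplit, SettlementActivity  # assumed import path


async def create_settlement_activity(
    db: AsyncSession, *, expense_split_id: int, payer_id: int, amount_paid: Decimal
) -> SettlementActivity:
    split = await db.get(ExpenseSplit, expense_split_id)
    if split is None:
        raise HTTPException(status.HTTP_404_NOT_FOUND, "Expense split not found")

    # Total already settled against this split (0 if nothing yet).
    already_paid = (
        await db.execute(
            select(func.coalesce(func.sum(SettlementActivity.amount_paid), 0)).where(
                SettlementActivity.expense_split_id == expense_split_id
            )
        )
    ).scalar_one()

    remaining = split.owed_amount - Decimal(already_paid)
    if amount_paid > remaining:
        raise HTTPException(
            status.HTTP_400_BAD_REQUEST,
            f"Settlement of {amount_paid} exceeds the remaining owed amount ({remaining}).",
        )

    activity = SettlementActivity(
        expense_split_id=expense_split_id, payer_id=payer_id, amount_paid=amount_paid
    )
    db.add(activity)
    await db.flush()  # caller owns the commit
    return activity
```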
---

### 6. Flawed Business Logic Preventing Group Expenses from Personal Lists

**Severity: MEDIUM**

**Problem:**
The `create_new_expense` endpoint prevents a user from creating a group expense that is associated with one of their personal shopping lists.

**Why this is a problem:**
This does not reflect a real-world use case. A user might use a personal list to do shopping for a group and then need to file the expense against that group. The current logic forces an artificial separation.

**Recommendation:**

- Remove the validation check that blocks this action. A user should be able to link a group expense to any list they have access to, regardless of whether it is a personal or group list. The critical factor is the `group_id` on the expense itself, not on the list it is optionally linked to.

---

### 7. Unrestricted Access to All Chore History

**Severity: CRITICAL**

**Problem:**
The chore history endpoints in `be/app/api/v1/endpoints/chores.py` lack any authorization checks.

**Specific Endpoints:**

- `GET /{chore_id}/history`
- `GET /assignments/{assignment_id}/history`

Any authenticated user can access the historical data for _any_ chore or assignment in the entire database simply by guessing its ID. This allows users from one group to spy on the activities of other groups, which is a major data leak.

**Recommendation:**

- Add strict authorization checks to these endpoints immediately.
- Before returning any history, the code must verify that the `current_user` is a member of the group to which the chore or assignment belongs.
- For personal chores, it must verify that the `current_user` is the creator of the chore.

---

### 8. Redundant and Confusing Chore Endpoints

**Severity: LOW**

**Problem:**
The API has a completely separate set of endpoints for "Personal Chores" (e.g., `POST /personal`, `DELETE /personal/{chore_id}`). It also provides generic endpoints (`PUT /{chore_id}`) that can perform the same actions and more, such as converting a personal chore to a group chore.

**Why this is a problem:**

- **API Bloat:** It makes the API larger and more confusing than necessary.
- **Maintenance Burden:** Duplicate logic increases the chance of bugs and makes the codebase harder to maintain. For example, if a business rule for updates changes, it might need to be updated in multiple places.
- **Inconsistency:** The `update_personal_chore` and `update_group_chore` endpoints have slightly different logic and permission checks than the generic `update_chore_any_type` endpoint, which is a recipe for subtle bugs.

**Recommendation:**

- Deprecate and remove the separate `/personal/*` and `/groups/{group_id}/chores/*` endpoints for `POST`, `PUT`, and `DELETE`.
- Consolidate all create, update, and delete logic into a single set of endpoints (e.g., `POST /chores`, `PUT /chores/{chore_id}`, `DELETE /chores/{chore_id}`).
- This single set of endpoints should contain clear, unified logic that handles permissions based on the chore's `type` (personal or group).

---

### 9. Sensitive User Data Leaked in Application Logs

**Severity: MEDIUM**

**Problem:**
Across multiple endpoints (`financials.py`, `groups.py`, `chores.py`, etc.) the code logs user email addresses and sometimes internal object IDs at the **INFO** or **WARNING** level. Example:

```python
logger.info(f"User {current_user.email} creating expense: {expense_in.description}")
```

Logging personally identifiable information (PII) such as email addresses at non-debug levels creates a compliance risk (GDPR and similar data-protection law) and can leak data if log files are exposed.

**Recommendation:**

- Remove PII (emails, names) from log messages at INFO/WARNING/ERROR levels.
- If user identifiers are required, use the numeric `user.id` or a hashed identifier.
- Move any detailed, user-specific logging to DEBUG level with redaction.

---

### 10. Detailed Internal Errors Returned to Clients

**Severity: MEDIUM**

**Problem:**
Many `except` blocks convert low-level SQLAlchemy exceptions directly into HTTP 400/500 responses with `detail=str(e)`. This leaks internal table names, constraint names, and stack traces to the client. **Example:** `crud/expense.py` and other CRUD modules.

**Recommendation:**

- Map all internal exceptions to generic error messages for clients (e.g., "Database error, please try again"); a minimal global handler is sketched below.
- Log the original exception server-side (with redaction) but never expose raw error strings to API consumers.
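One way to implement this centrally, assuming the CRUD layer is changed to let SQLAlchemy exceptions propagate rather than catching and stringifying them, is a global exception handler registered on the app in `be/app/main.py`:

```python
import logging

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from sqlalchemy.exc import SQLAlchemyError

logger = logging.getLogger(__name__)

app = FastAPI()


@app.exception_handler(SQLAlchemyError)
async def handle_db_error(request: Request, exc: SQLAlchemyError) -> JSONResponse:
    # Full details (table names, constraints, driver messages) stay in
    # the server logs; the client only ever sees a generic message.
    logger.error("Database error on %s %s", request.method, request.url.path, exc_info=exc)
    return JSONResponse(
        status_code=500,
        content={"detail": "Database error, please try again."},
    )
```

Starlette matches handlers by `isinstance`, so this one catches `IntegrityError`, `OperationalError`, and every other `SQLAlchemyError` subclass in one place.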
---

### 11. Insecure Default Configuration Values

**Severity: HIGH**

**Problem:**
`app/config.py` sets insecure defaults that may accidentally reach production:

- `SESSION_SECRET_KEY = "your-session-secret-key"` – a published default secret.
- `ACCESS_TOKEN_EXPIRE_MINUTES = 480` (8 hours) – excessive token lifetime increases risk if a token is leaked.
- Wide-open `CORS_ORIGINS` allowing any localhost port.

**Recommendation:**

- Make `SESSION_SECRET_KEY` **mandatory** via env var with no default. Fail hard if missing.
- Reduce default `ACCESS_TOKEN_EXPIRE_MINUTES` to ≤60 and allow override via env.
- In production mode (`settings.is_production`), restrict CORS origins to the canonical frontend domain list only.

---

### 12. Missing Global Rate Limiting / Brute-Force Protection

**Severity: MEDIUM**

**Problem:**
The API exposes authentication (JWT login) and sensitive endpoints without any rate-limiting or brute-force protection, leaving it vulnerable to password-guessing and scraping attacks.

**Recommendation:**

- Integrate a middleware-level rate limiter (e.g., `slowapi`, `fastapi-limiter`) backed by Redis.
- Apply stricter per-IP limits on `/auth/jwt/login`, OTP verification, and password-reset endpoints.

---

### 13. Recurring Expense Job Never Commits Transactions

**Severity: HIGH**

**Problem:**
`be/app/jobs/recurring_expenses.py` uses `db.flush()` inside `_generate_next_occurrence` but **never calls `commit()`** on the `AsyncSession`. Because the helper is executed inside the scheduler wrapper `run_recurring_expenses_job`, the surrounding context manager closes the session without an explicit commit, so Postgres rolls the transaction back. No newly generated expenses are permanently saved and the `max_occurrences` counter is not decremented. At the next scheduler tick the exact same expenses are re-generated, flooding the database and the UI with duplicates.

**Recommendation:**

1. Replace the manual `flush()` calls with a **single transactional block** (e.g., `async with db.begin(): …`) and call `await db.commit()` once per expense or after the batch loop (see the sketch below).
2. Add an integration test that verifies a generated expense actually persists after the job finishes.
3. Consider running the scheduler job inside a **database advisory lock** to avoid multiple workers generating duplicates concurrently.
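A sketch of the transactional wrapper; `async_session_maker` and `_get_due_recurring_expenses` are assumed names for the project's session factory and query helper, while `_generate_next_occurrence` is the existing helper named above:

```python
from app.database import async_session_maker  # assumed session factory name


async def run_recurring_expenses_job() -> None:
    async with async_session_maker() as db:
        # db.begin() opens a transaction that commits on clean exit
        # and rolls back automatically if anything below raises.
        async with db.begin():
            templates = await _get_due_recurring_expenses(db)  # assumed helper
            for template in templates:
                # The existing helper may keep using flush(); the surrounding
                # transaction now guarantees a real commit at the end.
                await _generate_next_occurrence(db, template)
        # Reaching this point means the commit happened: the generated
        # expenses and decremented max_occurrences counters are durable.
```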
---

### 14. Blocking, Mixed-Sync Scheduler Configuration Can Starve the Event Loop

**Severity: MEDIUM**

**Problem:**
`be/app/core/scheduler.py` converts the async database URL to a _sync_ URL (`postgresql://`) for `SQLAlchemyJobStore` **and** registers the job with `ThreadPoolExecutor`. This combination forces a blocking, synchronous DB driver (psycopg2) to execute inside FastAPI's event loop thread pool. Under load, long-running jobs will consume the limited thread pool, delay HTTP requests, and may exhaust the sync connection pool, causing 5xx errors throughout the API layer.

**Recommendation:**

- Use `AsyncIOJobStore` with the existing `postgresql+asyncpg` URL, or move scheduled tasks to a dedicated worker (e.g., Celery, RQ) outside the API process.
- If synchronous execution is unavoidable, enlarge the job executor pool and _run the scheduler in a separate process_ so it cannot impact request latency.

---

### 15. Public Root & Health Endpoints Leak Deployment Metadata

**Severity: MEDIUM**

**Problem:**
Unauthenticated `GET /` and `GET /health` responses include the `environment` ("development", "staging", "production") and full semantic `version` string straight from `app/config.py`. Attackers can fingerprint deployment targets and time their exploits around release cycles.

**Recommendation:**

- Limit environment & version details to authenticated admin users or internal monitoring networks.
- Replace the public response with a minimal `{ "status": "ok" }` payload.
- Add a separate, authenticated diagnostics endpoint for detailed build info.

---

### 16. Wildcard-Style CORS With `allow_credentials=True` Enables Session Hijacking

**Severity: HIGH**

**Problem:**
`be/app/main.py` registers `CORSMiddleware` with the entire `settings.cors_origins_list` **and** sets `allow_credentials=True`, which instructs browsers to _send cookies and HTTP-Only session tokens_ to _any_ origin in that list. Because the default list contains every localhost port and is hard-coded, a malicious site running on another local port can silently read or mutate a user's data via XHR once the victim logs in to the real front-end.

**Recommendation:**

1. In production, populate `CORS_ORIGINS` from env and restrict it to the canonical front-end domain(s).
2. Set `allow_credentials=False` unless cookie-based auth is absolutely required; JWTs are already used for API auth.
3. Use the `strict-origin-when-cross-origin` referrer policy and enable CSRF protection if cookies stay enabled.

---

### 17. Groups & Settlements Lack Optimistic-Locking Enforcement

**Severity: MEDIUM**

**Problem:**
The `groups`, `settlements`, and `settlement_activities` tables include a `version` column, but **none of the corresponding endpoints check or increment that version** during `PUT`/`DELETE` operations (unlike `lists` and `items`). Concurrent updates therefore overwrite each other silently, leading to lost data and inconsistent financial history.

**Recommendation:**

- Mirror the pattern used in the Lists API: require clients to supply the expected `version` and return **409 Conflict** on mismatch.
- Increment `version` in SQL via `UPDATE … SET version = version + 1` inside the same transaction to guarantee atomicity.

---

### 18. Redis Connection Pool Never Closed

**Severity: LOW**

**Problem:**
`be/app/core/redis.py` creates a module-level `redis_pool` but the pool is **never closed** on application shutdown. In long-running worker processes (e.g., during Kubernetes rolling updates) this leaks file descriptors and keeps TCP connections open until the OS forcibly times them out.

**Recommendation:**

- Add a lifespan handler or shutdown event to call `await redis_pool.aclose()`, as sketched below.
- Use a factory in the DI system to create and dispose Redis connections per request where feasible.
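A minimal lifespan-based sketch, assuming redis-py ≥ 5 (which provides `aclose()`; older versions expose `await redis_pool.disconnect()` instead) and that the pool import path matches the module named above:

```python
from contextlib import asynccontextmanager

from fastapi import FastAPI

from app.core.redis import redis_pool  # the module-level pool


@asynccontextmanager
async def lifespan(app: FastAPI):
    yield  # application serves requests
    # Shutdown: release pooled TCP connections instead of leaking them.
    await redis_pool.aclose()


app = FastAPI(lifespan=lifespan)
```

---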