data-management

📁 nguyenhuuca/assessment 📅 10 days ago
9
总安装量
9
周安装量
#32137
全站排名
安装命令
npx skills add https://github.com/nguyenhuuca/assessment --skill data-management

Agent 安装分布

openclaw 9
github-copilot 9
codex 9
kimi-cli 9
gemini-cli 9
cursor 9

Skill 文档

Data Management

Workflows

  • Schema Design: Define tables, relationships, constraints
  • Migrations: Version control schema changes
  • Indexing: Add indexes for query performance
  • Backup: Ensure data recovery capability

Schema Design Principles

Normalization

  • 1NF: Atomic values, no repeating groups
  • 2NF: No partial dependencies
  • 3NF: No transitive dependencies

When to Denormalize

  • Read-heavy workloads
  • Reporting/analytics
  • Caching layers

Migration Best Practices (Liquibase + PostgreSQL)

Forward-Only Migrations

Each migration should be a single forward step.

<!-- src/main/resources/db/changelog/changes/001-create-users.xml -->
<databaseChangeLog>
  <changeSet id="001-create-users" author="dev">
    <createTable tableName="users">
      <column name="id" type="BIGSERIAL">
        <constraints primaryKey="true" nullable="false"/>
      </column>
      <column name="email" type="VARCHAR(255)">
        <constraints nullable="false" unique="true"/>
      </column>
      <column name="created_at" type="TIMESTAMP WITH TIME ZONE" defaultValueComputed="CURRENT_TIMESTAMP">
        <constraints nullable="false"/>
      </column>
    </createTable>

    <createIndex indexName="idx_users_email" tableName="users">
      <column name="email"/>
    </createIndex>
  </changeSet>
</databaseChangeLog>
# db/changelog/db.changelog-master.yaml
databaseChangeLog:
  - include:
      file: changes/001-create-users.xml
      relativeToChangelogFile: true
-- Alternative: SQL format with Liquibase
-- liquibase formatted sql

-- changeset dev:001-create-users
CREATE TABLE users (
  id BIGSERIAL PRIMARY KEY,
  email VARCHAR(255) NOT NULL UNIQUE,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_users_email ON users(email);
-- rollback DROP TABLE users;

Safe Migrations

  • Add columns as nullable first
  • Create indexes concurrently
  • Never drop columns in the same deploy

Indexing Strategy

-- B-tree (default): Equality and range queries
CREATE INDEX idx_users_email ON users(email);

-- Partial index: When you query a subset
CREATE INDEX idx_active_users ON users(id) WHERE active = true;

-- Composite index: Multiple columns
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at DESC);

Connection Management (HikariCP + Spring Boot)

# application.yaml - HikariCP is default in Spring Boot
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/funnyapp
    username: ${DB_USER}
    password: ${DB_PASS}
    hikari:
      maximum-pool-size: 20              # Max connections
      minimum-idle: 5                     # Min idle connections
      connection-timeout: 2000            # Max wait for connection (ms)
      idle-timeout: 30000                 # Close idle connections after (ms)
      max-lifetime: 1800000               # Max connection lifetime (30 min)
      pool-name: FunnyAppPool
      leak-detection-threshold: 60000     # Detect connection leaks (1 min)

  jpa:
    hibernate:
      ddl-auto: none                      # Use Liquibase instead
    properties:
      hibernate:
        default_schema: public
        format_sql: true
        jdbc:
          batch_size: 20
        order_inserts: true
        order_updates: true

  liquibase:
    change-log: classpath:db/changelog/db.changelog-master.yaml
    enabled: true
// JPA Entity with proper relationships
@Entity
@Table(name = "users", indexes = {
    @Index(name = "idx_users_email", columnList = "email")
})
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class User {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false, unique = true)
    private String email;

    @Column(name = "created_at", nullable = false, updatable = false)
    @Builder.Default
    private Instant createdAt = Instant.now();

    @OneToMany(mappedBy = "user", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Video> videos = new ArrayList<>();
}

// Repository with proper query optimization
@Repository
public interface UserRepository extends JpaRepository<User, Long> {
    Optional<User> findByEmail(String email);

    @Query("SELECT u FROM User u LEFT JOIN FETCH u.videos WHERE u.id = :id")
    Optional<User> findByIdWithVideos(@Param("id") Long id);

    @Query(value = "SELECT * FROM users WHERE created_at > :since ORDER BY created_at DESC",
           nativeQuery = true)
    List<User> findRecentUsers(@Param("since") Instant since);
}

Data Integrity

  • Use foreign key constraints
  • Add NOT NULL where appropriate
  • Use CHECK constraints for validation
  • Consider using ENUM types for fixed values