MysqlImport

Simple concurrent importer for MySQL using load_data_infile2.

Installation

Add to your application's Gemfile:

gem 'mysql_import'

And bundle.

Examples

Basic Usage

For exampole, if you want to import to users table from /path/to/users.csv:

db_config = {
  host: 'localhost'
  database: 'mysql_import_test'
  username: 'root'
}
importer = MysqlImport.new(db_config)
importer.add('/path/to/users.csv')
importer.import
# => Import to `users` tables

Multiple import:

importer = MysqlImport.new(db_config)
importer.add('/path/to/users.csv')
importer.add('/path/to/groups.csv')
importer.add('/path/to/departments.csv')
importer.import
# => Import to three tables from three csv files

MysqlImport has the concurrency because it uses the parallel gem.

With import options:

importer = MysqlImport.new(db_config)
importer.add('/path/to/users1.csv', table: 'users')
importer.add('/path/to/users2.csv', table: 'users')
importer.add('/path/to/users3.csv', table: 'users')
importer.import
# => Import to `users` table from three csv files

Options

#initialize

Key that can be passed to the second argument of the option of MysqlImport#initialize is the four types.

concurrency: The number of threads to use. Ruby' GIL is released when the IO waiting occurs in mysql, you might be effective by concurrent processing. (default: 2)

importer = MysqlImport.new(db_config, concurrency: 4)

log: This is an option for the logger. (default: nil)

# File path
importer = MysqlImport.new(db_config, log: '/path/to/import.log')

# nil(This is the same as `/dev/null`)
importer = MysqlImport.new(db_config, log: nil)

# STDOUT / STDERR
importer = MysqlImport.new(db_config, log: $stdout)

# Custom logger
importer = MysqlImport.new(db_config, log: CustomLogger.new)

debug: This is a flag to the level of the logger to debug. (default: false)

importer = MysqlImport.new(db_config, log: $stdout, debug: true)

sql_opts: This is the option of import to be passed directly to the second argument of LoadDataInfile2#initialize.

See more details for import options.

https://github.com/nalabjp/load_data_infile2#sql-options

#add

The second argument of MysqlImport#add will be passed directly to the second argument of LoadDataInfile2#import.

See more details for import options.

https://github.com/nalabjp/load_data_infile2#sql-options

Filter

If you want to import only a specific file, you can specify the file.

The specification of the file will be used regular expression

String

importer = MysqlImport.new(db_config)
importer.add('/path/to/users.csv')
importer.add('/path/to/groups.csv')
importer.add('/path/to/departments.csv')
importer.import('users')
# => Only import to `users` table

Array

importer.import(['users', 'groups'])
# => Import to `users` and `groups` table

If empry:

importer.import([])
# => Do not import anything

Hook

You are able to set the hook immediately before and after import.

The hook will accept either String or Proc or Array.

String

String is evaluated directly as SQL.

importer = MysqlImport.new(db_config)
importer.add(
  '/path/to/users.csv',
  {
    before: 'TRUNCATE TABLE users;'
  }
)
importer.import
# => Truncate query is always executed before import.

Proc

If you want to make the subsequent processing based on the execution result of SQL, you should use Proc.

Arguments that are passed to Proc is an instance of LoadDataInfile2::Client, which is a subclass of Mysql2::Client.

importer = MysqlImport.new(db_config)
importer.add(
  '/path/to/users.csv',
  {
    before: ->(cli) {
      res = cli.query('SELECT COUNT(*) AS c FROM users;')
      cli.query('TRUNCATE TABLE users;') if res.first['c'] > 0
    }
  }
)
importer.import
# => If there is one or more records in `users` table, truncate query is executed.

Array

Array of elements you need to use String or Proc.

importer = MysqlImport.new(db_config)
importer.add(
  '/path/to/users.csv',
  {
    before: [
      "SET sql_mode = 'STRICT_TRANS_TABLES';",
      ->(cli) {
        res = cli.query('SELECT COUNT(*) AS c FROM users;')
        cli.query('TRUNCATE TABLE users;') if res.first['c'] > 0
      }
    ],
    after: [
      'SET @i = 0;',
      'UPDATE users SET order = (@i := @i + 1) ORDER BY name, email ASC;',
    ]
  }
)
importer.import

Skip all subsequent processing

If you want to skip all subsequent processing, you will need to raise MysqlImport::Break in Proc.

importer = MysqlImport.new(db_config)
importer.add(
  '/path/to/users.csv',
  {
    before: ->(cli) {
      res = cli.query('SELECT COUNT(*) AS c FROM users;')
      raise MysqlImport::Break if res.first['c'] > 0
    },
    after: [
      'SET @i = 0;',
      'UPDATE users SET order = (@i := @i + 1) ORDER BY name, email ASC;',
    ]
  }
)
importer.import
# => If there is one or more records in `users` table, import and after hook will be skipped.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/nalabjp/mysql_import.

License

The gem is available as open source under the terms of the MIT License.

mysql_import

Development

Runtime

MysqlImport

Installation

Examples

Basic Usage

Options

#initialize

#add

Filter

String

Array

Hook

String

Proc

Array

Skip all subsequent processing

Contributing

License