wangcz1988 / pentaho-kettle

Project Layout

  • Apache Ivy support has been added to resolve dependencies.
    • This has eliminated the need to commit JAR files into the version control system. It will also help with conflict management, to ensure all Kettle modules and plugins are using the same JARs.
  • The structure of the project has changed. What were source folders are now subprojects that can be built independently. These subprojects contain their own Ivy files.
    • For example, "src-ui" has become the "ui" module. Inside the "ui" project is the src folder that was "src-ui". It also has files such as build.xml, build.properties, ivy.xml, etc.

Compiling

  1. Run ant clean-all resolve create-dot-classpath
    • These targets resolve and retrieve the dependencies (for example, third-party and Pentaho JARs) and update your Eclipse classpath.
  2. Run ant dist
    • This builds the Kettle modules and core plugins and generates a local distribution folder, dist/, which can be used to run Spoon or other programs.

Notes:

  • Apache Ivy manages the creation of the .classpath file for the Eclipse project. You do not need to include this file in a pull request, and doing so is not recommended.
  • A copy of the ant binary is also available within most Eclipse installations and can be run from the console once the PATH environment variable includes its location.
  • The build process also requires Maven (sudo apt-get install maven2 on Ubuntu Linux).
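
Because the build needs both Ant and Maven on the PATH, a quick check before the first build can save a failed run. The following is a minimal sketch, not part of the Kettle build; missing_tools is a hypothetical helper name.

```shell
# Print each named tool that cannot be found on PATH, one per line.
# missing_tools is a hypothetical helper, not part of the Kettle build.
# Usage: missing_tools ant mvn
missing_tools() {
  for tool in "$@"; do
    # command -v succeeds only when the shell can resolve the name.
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running missing_tools ant mvn prints nothing when both tools are installed; otherwise it prints the names that still need to be installed or added to the PATH.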

Contributing

  1. Submit a pull request, referencing the relevant Jira case
  2. Attach a Git patch file to the relevant Jira case

Using the Pentaho checkstyle format (by running ant checkstyle and reviewing the report) and developing working unit tests help to ensure that pull requests for bugs and improvements are processed quickly.

FAQ

How do I set up Run and Debug configurations in Eclipse?

Running the "create-dot-classpath" Ant target will create a launch configuration (using the template provided by project.launch) named after your project folder, such as kettle-trunk.launch, and place the .launch file in the project's root folder.

Restarting Eclipse will make it available in the Run/Run Configurations... and Run/Debug Configurations... drop-down menus. To use the launch configuration without restarting Eclipse, right-click the .launch file and select "Run As...", then the name of the project.

Let's say I just want to add a new property to a step using Eclipse as my IDE. What do I have to do?

  1. Check out the project and set it up as an Eclipse Java project.
  2. Run "ant clean-all resolve create-dot-classpath".
  3. Refresh the Eclipse project to sync the workspace with the file system.
  4. Make the appropriate code changes in the step meta and the step dialog.
  5. Run the default Ant target.
  6. Verify the changes by running the project's .launch file, which is named after the Eclipse project (for example, kettle-trunk.launch).

If I want to build the project with Ant should I always use the default target?

To simply build/compile the code, use the default target. To get a full Kettle distribution, use the "dist" Ant target. To build the distribution (or any module or plugin) from a clean workspace, run the following Ant targets from the root directory of the desired artifact:

ant clean-all resolve dist

My code changes were just in the engine module. Can I run Ant from there?

You can use the build file located in the engine folder, e.g.,

cd engine
ant clean-all resolve dist

I get compile errors! Cannot find symbols and packages that don't exist!

When you ran the default build from the project's root folder, you resolved dependencies into its lib folder. You need to resolve the engine module's dependencies and then compile:

cd engine
ant resolve compile

That seems redundant.

Yes, but we are building modules now. If your Ivy cache already contains the dependencies, the resolve should be fairly quick.

I ran Spoon from the project's dist folder. Why can't I see my changes I just compiled?

You need to run "ant dist" at the project level.

Here is an example of compiling engine source and then "disting" the project:

cd engine
ant compile

No compile errors!

cd ..
ant dist

No errors!

cd dist
sh spoon.sh

Changes should be reflected in Spoon!
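
The sequence above can be wrapped in a small helper so that a failed compile stops before the dist step. This is a sketch under the assumption that it runs from a checkout laid out as described here; rebuild_module is a hypothetical name, and only the ant compile and ant dist targets shown above are used.

```shell
# Compile one module, then refresh the project-level distribution.
# rebuild_module is a hypothetical helper, not part of the Kettle build.
# Usage: rebuild_module <project-root> <module>   (e.g. rebuild_module . engine)
rebuild_module() {
  root=$1
  module=$2
  # Compile only the changed module; stop here if compilation fails.
  (cd "$root/$module" && ant compile) || return 1
  # Rebuild dist/ at the project level so Spoon picks up the new classes.
  (cd "$root" && ant dist) || return 1
}
```

After rebuild_module . engine succeeds, Spoon can be started from the dist folder with sh spoon.sh as shown above.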

If I needed to change something in DB, like the default port for PostgreSQL, do I need to check out all of Kettle and build it?

You will get a full working copy of Kettle when you check out a branch from the Git project. However, you do not need to build all of Kettle if your changes are isolated to a particular module or plugin. In this example you can go into the "core" folder and run the following Ant target set:

clean-all resolve dist

A kettle-core JAR will be built and placed in the project's dist/ folder.

To test out your changes you can grab a Kettle build from CI: http://ci.pentaho.com/view/Data%20Integration/job/Kettle/

Replace the kettle-core jar in the CI build's lib/ folder and run Spoon. Create a new DB connection with PostgreSQL as the connection type. You should see your new default port number.
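
The JAR replacement can be scripted. The sketch below assumes the locally built JAR matches the pattern kettle-core*.jar (the exact file name depends on the version and is not given here) and that the CI build has been unpacked somewhere on disk; swap_core_jar is a hypothetical helper name.

```shell
# Swap a locally built kettle-core JAR into an unpacked CI build.
# swap_core_jar is a hypothetical helper; the kettle-core*.jar pattern is an assumption.
# Usage: swap_core_jar <dist-dir> <ci-lib-dir>
swap_core_jar() {
  dist_dir=$1
  ci_lib=$2
  # Fail early if the local build did not produce a kettle-core JAR.
  ls "$dist_dir"/kettle-core*.jar >/dev/null 2>&1 || return 1
  # Remove the JAR shipped with the CI build, then copy the local one in.
  rm -f "$ci_lib"/kettle-core*.jar
  cp "$dist_dir"/kettle-core*.jar "$ci_lib"/
}
```

Call it with the dist/ folder that received the JAR and the lib/ folder of the CI build, then start Spoon from the CI build to check the new default port.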

Why does the build output appear to download JARs multiple times?

Although it may appear to be downloading JARs multiple times, Ivy will download the dependencies once and cache them (in your home folder under .ivy2/cache) for later use. When a dependency is being downloaded you will see multiple periods displayed (for example: "........."). While running the default Ant target (or the "create-dot-classpath" target), Ivy will resolve the dependencies for each Kettle module and core plugin. It does this by first checking the local repository, then your local cache, then other public repositories. Ivy will output a line for each resolved dependency, but that does not mean the artifact is being downloaded. Rather, Ivy is checking to see if the artifact is already present locally and, if so, will use it. Therefore, you may see lots of lines in the Ant output for Ivy resolve tasks, but if you don't see the periods, then the artifacts already exist locally and will not be downloaded again.
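
One way to confirm that nothing is being downloaded again is to count the JARs in the cache before and after a build. This is a minimal sketch; count_cached_jars is a hypothetical helper name, and the default path is the .ivy2/cache location mentioned above.

```shell
# Count the JARs currently held in an Ivy cache directory.
# count_cached_jars is a hypothetical helper, not part of the Kettle build.
# Usage: count_cached_jars [cache-dir]   (defaults to ~/.ivy2/cache)
count_cached_jars() {
  cache_dir=${1:-"$HOME/.ivy2/cache"}
  # A missing cache directory means nothing has been downloaded yet.
  [ -d "$cache_dir" ] || { echo 0; return 0; }
  # tr strips the padding that some wc implementations add.
  find "$cache_dir" -type f -name '*.jar' | wc -l | tr -d ' '
}
```

If the count is the same before and after a build, the resolve lines in the Ant output were cache lookups rather than downloads.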

After checking out Kettle for the first time, why does the first build take so long?

This is an effect of using Ivy for dependency management. All JARs used to be checked into version control, which made the checkout itself slow. Now the initial checkout should be much faster, but the first build will take much longer because Ivy downloads all the dependencies to its local cache. Every build after that should be significantly faster.

I seem to be getting Ivy-related errors while running Ant targets. What should I do?

It is possible that your Ivy cache has become corrupt. If you know which dependencies seem to be causing the issue, you can go to the Ivy cache (under your home folder at .ivy2/cache), find the folder containing the artifact(s), delete the folder, then re-run your Ant target. If this does not work, you can run the "ivy-clean-cache" and "ivy-clean-local" Ant tasks to clean your entire Ivy cache and local repository, respectively.
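
Deleting a single folder from the cache can be sketched as follows. purge_cached_folder is a hypothetical helper name, and the folder name passed to it is whatever folder under the cache holds the suspect artifact(s).

```shell
# Delete one folder from the Ivy cache so the next resolve downloads it again.
# purge_cached_folder is a hypothetical helper, not part of the Kettle build.
# Usage: purge_cached_folder <folder-name> [cache-dir]   (defaults to ~/.ivy2/cache)
purge_cached_folder() {
  folder=$1
  cache_dir=${2:-"$HOME/.ivy2/cache"}
  # Refuse an empty name: it would otherwise remove the whole cache.
  if [ -z "$folder" ]; then
    echo "usage: purge_cached_folder <folder-name> [cache-dir]" >&2
    return 2
  fi
  rm -rf "$cache_dir/$folder"
}
```

After removing the folder, re-run the Ant target that failed so that Ivy fetches a fresh copy.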

I removed the right directories from the ~/.ivy2/cache directory but I am still having Ivy issues.

If you are seeing error messages like:

[ivy:resolve] :: problems summary ::
[ivy:resolve] :::: WARNINGS
[ivy:resolve]     module not found: pentaho-kettle#kettle-db;TRUNK-SNAPSHOT
[ivy:resolve]   ==== local: tried
[ivy:resolve]     /home/rbouman/.ivy2/local/pentaho-kettle/kettle-db/TRUNK-SNAPSHOT/ivys/ivy.xml

then your local Ivy files (in .ivy2/local) are trying to pull in a JAR that is no longer available (and probably no longer needed). To remedy this, remove the entire .ivy2/local directory and retry.
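
To see which modules Ivy could not find, the "module not found" lines can be pulled out of a saved build log. This is a minimal sketch that assumes the log uses the message format shown above; missing_ivy_modules is a hypothetical helper name.

```shell
# Print the modules that Ivy reported as "module not found" in a saved log.
# missing_ivy_modules is a hypothetical helper, not part of the Kettle build.
# Usage: missing_ivy_modules <ant-log-file>
missing_ivy_modules() {
  # Keep only the text after the marker on lines that contain it.
  sed -n 's/.*module not found: //p' "$1"
}
```

For example, save the output with ant resolve > build.log 2>&1 and then run missing_ivy_modules build.log. For the log above it would list pentaho-kettle#kettle-db;TRUNK-SNAPSHOT.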

I'm making a change to Kettle that requires a new (or newer version of a) third-party library or dependency. What do I do?

No JAR files should be committed to the Kettle project. Instead, locate the ivy.xml file in the module or core plugin folder that contains your code changes, and find the tag that refers to the dependency you'd like to update. If the dependency exists, simply update the revision and run the "resolve" Ant target. If the dependency tag for an existing JAR is not present in the ivy.xml file, it is likely being brought in "transitively" by a dependency on some other Kettle or Pentaho module. In this case, for development you can add the Ivy dependency to the file manually and run the "resolve" Ant target. However, rather than committing the change to ivy.xml, please write a Jira case asking for the update of the desired dependencies. This will allow Pentaho to ensure that updating the dependencies won't interfere with other modules that use the same JARs.

If a new dependency is needed, simply add the dependency to the appropriate ivy.xml file and commit with descriptive comments.
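
Before changing a revision, it can help to check what a module's ivy.xml currently declares. The sketch below assumes each dependency tag sits on a single line with the name attribute before the rev attribute, which may not hold for every file; dependency_rev is a hypothetical helper name, and the name is treated as a regular expression.

```shell
# Print the revision declared for one dependency in an ivy.xml file.
# dependency_rev is a hypothetical helper, not part of the Kettle build.
# Assumes the <dependency> tag is on one line with name= before rev=.
# Usage: dependency_rev <ivy-file> <dependency-name>
dependency_rev() {
  ivy_file=$1
  dep_name=$2
  # Capture the value of rev="..." on the line that names the dependency.
  sed -n "s/.*<dependency.* name=\"$dep_name\".* rev=\"\([^\"]*\)\".*/\1/p" "$ivy_file"
}
```

For a hypothetical line such as &lt;dependency org="commons-lang" name="commons-lang" rev="2.6"/&gt;, running dependency_rev ivy.xml commons-lang would print 2.6. No output suggests the dependency is brought in transitively, as described above.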

IMPORTANT: If a new dependency (JAR) is being introduced, make sure the license is not GPL or AGPL. These licenses are not "Pentaho-friendly" and we cannot distribute these JARs without all Kettle source code becoming GPL. LGPL licensing is ok for JARs but not for code. The most "Pentaho-friendly" licenses are permissive licenses such as Apache or MIT. If you have any questions about licensing, please contact Pentaho.

What is that "assembly" folder?

The assembly folder serves two purposes:

  1. It provides a staging area for building Kettle.
  2. It contains resources needed for a Kettle distribution. The resources are contained in the "package-res" folder.

What is "package-res" in assembly?

If you take a look in "package-res" you will see a folder structure that once was under the root of the Kettle project. These folders are packaged up into the distributable product.

Changes to shell scripts, launcher, images, and docs are made here.

