254e050976
See PDF attached to the JIRA issue 1406.

The contribution is my original work and I license the work to the project under the project's open source license.

Author: Vincenzo Selvaggio <vselvaggio@hotmail.it>
Author: Xiangrui Meng <meng@databricks.com>
Author: selvinsource <vselvaggio@hotmail.it>

Closes #3062 from selvinsource/mllib_pmml_model_export_SPARK-1406 and squashes the following commits:

852aac6 [Vincenzo Selvaggio] [SPARK-1406] Update JPMML version to 1.1.15 in LICENSE file
085cf42 [Vincenzo Selvaggio] [SPARK-1406] Added Double Min and Max Fixed scala style
30165c4 [Vincenzo Selvaggio] [SPARK-1406] Fixed extreme cases for logit
7a5e0ec [Vincenzo Selvaggio] [SPARK-1406] Binary classification for SVM and Logistic Regression
cfcb596 [Vincenzo Selvaggio] [SPARK-1406] Throw IllegalArgumentException when exporting a multinomial logistic regression
25dce33 [Vincenzo Selvaggio] [SPARK-1406] Update code to latest pmml model
dea98ca [Vincenzo Selvaggio] [SPARK-1406] Exclude transitive dependency for pmml model
66b7c12 [Vincenzo Selvaggio] [SPARK-1406] Updated pmml model lib to 1.1.15, latest Java 6 compatible
a0a55f7 [Vincenzo Selvaggio] Merge pull request #2 from mengxr/SPARK-1406
3c22f79 [Xiangrui Meng] more code style
e2313df [Vincenzo Selvaggio] Merge pull request #1 from mengxr/SPARK-1406
472d757 [Xiangrui Meng] fix code style
1676e15 [Vincenzo Selvaggio] fixed scala issue
e2ffae8 [Vincenzo Selvaggio] fixed scala style
b8823b0 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
b25bbf7 [Vincenzo Selvaggio] [SPARK-1406] Added export of pmml to distributed file system using the spark context
7a949d0 [Vincenzo Selvaggio] [SPARK-1406] Fixed scala style
f46c75c [Vincenzo Selvaggio] [SPARK-1406] Added PMMLExportable to supported models
7b33b4e [Vincenzo Selvaggio] [SPARK-1406] Added a PMMLExportable interface Restructured code in a new package mllib.pmml Supported models implements the new PMMLExportable interface: LogisticRegression, SVM, KMeansModel, LinearRegression, RidgeRegression, Lasso
d559ec5 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
8fe12bb [Vincenzo Selvaggio] [SPARK-1406] Adjusted logistic regression export description and target categories
03bc3a5 [Vincenzo Selvaggio] added logistic regression
da2ec11 [Vincenzo Selvaggio] [SPARK-1406] added linear SVM PMML export
82f2131 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
19adf29 [Vincenzo Selvaggio] [SPARK-1406] Fixed scala style
1faf985 [Vincenzo Selvaggio] [SPARK-1406] Added target field to the regression model for completeness Adjusted unit test to deal with this change
3ae8ae5 [Vincenzo Selvaggio] [SPARK-1406] Adjusted imported order according to the guidelines
c67ce81 [Vincenzo Selvaggio] Merge remote-tracking branch 'upstream/master' into mllib_pmml_model_export_SPARK-1406
78515ec [Vincenzo Selvaggio] [SPARK-1406] added pmml export for LinearRegressionModel, RidgeRegressionModel and LassoModel
e29dfb9 [Vincenzo Selvaggio] removed version, by default is set to 4.2 (latest from jpmml) removed copyright
ae8b993 [Vincenzo Selvaggio] updated some commented tests to use the new ModelExporter object reordered the imports
df8a89e [Vincenzo Selvaggio] added pmml version to pmml model changed the copyright to spark
a1b4dc3 [Vincenzo Selvaggio] updated imports
834ca44 [Vincenzo Selvaggio] reordered the import accordingly to the guidelines
349a76b [Vincenzo Selvaggio] new helper object to serialize the models to pmml format
c3ef9b8 [Vincenzo Selvaggio] set it to private
6357b98 [Vincenzo Selvaggio] set it to private
e1eb251 [Vincenzo Selvaggio] removed serialization part, this will be part of the ModelExporter helper object
aba5ee1 [Vincenzo Selvaggio] fixed cluster export
cd6c07c [Vincenzo Selvaggio] fixed scala style to run tests
f75b988 [Vincenzo Selvaggio] Merge remote-tracking branch 'origin/master' into mllib_pmml_model_export_SPARK-1406
07a29bf [selvinsource] Update LICENSE
8841439 [Vincenzo Selvaggio] adjust scala style in order to compile
1433b11 [Vincenzo Selvaggio] complete suite tests
8e71b8d [Vincenzo Selvaggio] kmeans pmml export implementation
9bc494f [Vincenzo Selvaggio] added scala suite tests added saveLocalFile to ModelExport trait
226e184 [Vincenzo Selvaggio] added javadoc and export model type in case there is a need to support other types of export (not just PMML)
a0e3679 [Vincenzo Selvaggio] export and pmml export traits kmeans test implementation
157 lines
5.3 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~ http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.10</artifactId>
    <version>1.4.0-SNAPSHOT</version>
    <relativePath>../pom.xml</relativePath>
  </parent>

  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib_2.10</artifactId>
  <properties>
    <sbt.project.name>mllib</sbt.project.name>
  </properties>
  <packaging>jar</packaging>
  <name>Spark Project ML Library</name>
  <url>http://spark.apache.org/</url>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-graphx_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>org.jblas</groupId>
      <artifactId>jblas</artifactId>
      <version>${jblas.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalanlp</groupId>
      <artifactId>breeze_${scala.binary.version}</artifactId>
      <version>0.11.2</version>
      <exclusions>
        <!-- This is included as a compile-scoped dependency by jtransforms, which is
             a dependency of breeze. -->
        <exclusion>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
        </exclusion>
        <exclusion>
          <groupId>org.apache.commons</groupId>
          <artifactId>commons-math3</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-math3</artifactId>
    </dependency>
    <dependency>
      <groupId>org.scalacheck</groupId>
      <artifactId>scalacheck_${scala.binary.version}</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.novocode</groupId>
      <artifactId>junit-interface</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-all</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.jpmml</groupId>
      <artifactId>pmml-model</artifactId>
      <version>1.1.15</version>
      <exclusions>
        <exclusion>
          <groupId>com.sun.xml.fastinfoset</groupId>
          <artifactId>FastInfoset</artifactId>
        </exclusion>
        <exclusion>
          <groupId>com.sun.istack</groupId>
          <artifactId>istack-commons-runtime</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
  <profiles>
    <profile>
      <id>netlib-lgpl</id>
      <dependencies>
        <dependency>
          <groupId>com.github.fommil.netlib</groupId>
          <artifactId>all</artifactId>
          <version>${netlib.java.version}</version>
          <type>pom</type>
        </dependency>
      </dependencies>
    </profile>
  </profiles>
  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
    <resources>
      <resource>
        <directory>../python</directory>
        <includes>
          <include>pyspark/mllib/*.py</include>
          <include>pyspark/mllib/stat/*.py</include>
          <include>pyspark/ml/*.py</include>
          <include>pyspark/ml/param/*.py</include>
        </includes>
      </resource>
    </resources>
  </build>
</project>